From the INTERNATIONAL BUREAU 



PCT 

NOTIFICATION OF ELECTION 

(PCT Rule 61.2) 


To: 

Commissioner 

US Department of Commerce 
United States Patent and Trademark 
Office, PCT 

2011 South Clark Place Room 
CP2/5C24 

Arlington, VA 22202 
ETATS-UNIS D'AMERIQUE 

in its capacity as elected Office 


Date of mailing (day/month/year) 
05 April 2001 (05.04.01) 




International application No. 
PCT/GB00/02512 


Applicant's or agent's file reference 
42.73369 


International filing date (day/month/year) 
27 June 2000 (27.06.00) 


Priority date (day/month/year) 
28 June 1999 (28.06.99) 


Applicant 

LEXOW, Preben 



1. The designated Office is hereby notified of its election made: 

| X| in the demand filed with the International Preliminary Examining Authority on: 

25 January 2001 (25.01.01) 



| | in a notice effecting later election filed with the International Bureau on: 



2. The election [X] was 

was not 

made before the expiration of 19 months from the priority date or, where Rule 32 applies, within the time limit under 
Rule 32.2(b). 
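
The timing statement in item 2 can be checked with a few lines of date arithmetic. The sketch below is illustrative only: `add_months` is a hypothetical helper (not part of any PCT tooling), and it assumes the simple reading that a period expressed in months ends on the same day number in the target month, or on that month's last day if the day number does not exist. Extensions for non-working days are ignored.

```python
from datetime import date
import calendar


def add_months(start: date, months: int) -> date:
    """Return the date `months` months after `start`.

    Assumption: the period ends on the same day number, clamped to the
    last day of the target month when that day number does not exist.
    """
    total = start.month - 1 + months
    year = start.year + total // 12
    month = total % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


# Dates taken from the notification above.
priority_date = date(1999, 6, 28)
demand_filed = date(2001, 1, 25)

limit = add_months(priority_date, 19)
print("19-month limit:", limit.isoformat())
print("election made in time:", demand_filed <= limit)
```

Under these assumptions the limit falls on 28 January 2001, so a demand filed on 25 January 2001 is consistent with the "was" box being marked.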



The International Bureau of WIPO 


Authorized officer 


34, chemin des Colombettes 


Pascal Piriou 


1211 Geneva 20, Switzerland 




Facsimile No.: (41-22) 740.14.35 


Telephone No.: (41-22) 338.83.38 



Form PCT/IB/331 (July 1992) 



GB0002512 



From the INTERNATIONAL BUREAU 



PCT 

NOTIFICATION RELATING TO PRIORITY CLAIM 

(PCT Rules 26bis.1 and 26bis.2 and 
Administrative Instructions, Sections 402 and 409) 


To: 

JONES, Elizabeth, Louise 
Frank B. Dehn & Co. 
179 Queen Victoria Street 
London EC4V 4EL 
ROYAUME-UNI 


Date of mailing (day/month/year) 

08 November 2000 (08.11.00) 


Applicant's or agent's file reference 
42.73369 


IMPORTANT NOTIFICATION 


International application No. 
PCT/GB00/02512 


International filing date (day/month/year) 
27 June 2000 (27.06.00) 


Applicant 

COMPLETE GENOMICS AS et al 



The applicant is hereby notified of the following in respect of the priority claim(s) made in the international application. 

1. [X] Correction of priority claim. In accordance with the applicant's notice received on: 27 October 2000 (27.10.00), 

the following priority claim has been corrected to read as follows: 

NO 28 June 1999 (28.06.99) 19991325 

| | even though the indication of the number of the earlier application is missing. 

| | even though the following indication in the priority claim is not the same as the corresponding indication appearing 
in the priority document: 

2. | | Addition of priority claim. In accordance with the applicant's notice received on: , 

the following priority claim has been added: 

| | even though the indication of the number of the earlier application is missing. 

| | even though the following indication in the priority claim is not the same as the corresponding indication appearing 
in the priority document: 

3. [X] As a result of the correction and/or addition of (a) priority claim(s) under items 1 and/or 2, the (earliest) priority date is: 

28 June 1999 (28.06.99) 

4. | | Priority claim considered not to have been made. 

| | The applicant failed to respond to the Invitation under Rule 26bis.2(a) (Form PCT/IB/316) within the prescribed time limit. 

| | The applicant's notice was received after the expiration of the prescribed time limit under Rule 26bis.1(a). 

| | The applicant's notice failed to correct the priority claim so as to comply with the requirements of Rule 4.10. 
The applicant may, before the technical preparations for international publication have been completed and subject to the 
payment of a fee, request the International Bureau to publish, together with the international application, information 
concerning the priority claim. See Rule 26bis.2(c) and the PCT Applicant's Guide, Volume I, Annex B2(IB). 

5. [X] In case where multiple priorities have been claimed, the above item(s) relate to the following priority claim(s): 

NO 28 June 1999 (28.06.99) 19991325 



6. A copy of this notification has been sent to the receiving Office and 
[X] to the International Searching Authority (where the international search report has not yet been issued). 
[X] the designated Offices (which have already been notified of the receipt of the record copy). 
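
Every date on these forms is written twice, once in words and once in day/month/year digits, for example 28 June 1999 (28.06.99). The sketch below shows one way the two notations could be cross-checked when transcribing such a form. It is an illustration only: `parse_dual_date` and the regular expression are hypothetical names introduced here, and the pattern deliberately tolerates the stray spaces and the `/` separator that appear in some of the scanned entries.

```python
import re
from datetime import date

MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]

# Long form, then the short form in parentheses, e.g. "28 June 1999 (28.06.99)".
DUAL_DATE = re.compile(
    r"(\d{1,2})\s+([A-Za-z]+)\s+(\d{4})\s*"
    r"\(\s*(\d{2})\s*[./]\s*(\d{2})\s*[./]\s*(\d{2})\s*\)"
)


def parse_dual_date(text: str) -> date:
    """Parse a dual-notation date and verify that both notations agree."""
    match = DUAL_DATE.search(text)
    if match is None:
        raise ValueError(f"no dual-notation date found in {text!r}")
    day, month_name, year, dd, mm, yy = match.groups()
    # list.index raises ValueError for an unrecognised month name.
    month = MONTHS.index(month_name.capitalize()) + 1
    parsed = date(int(year), month, int(day))
    if (parsed.day, parsed.month, parsed.year % 100) != (int(dd), int(mm), int(yy)):
        raise ValueError(f"long and short forms disagree in {text!r}")
    return parsed


print(parse_dual_date("28 June 1999 (28.06.99)"))
print(parse_dual_date("27 JUNE 2000 (27/06/00)"))
```

A mismatch between the two notations raises `ValueError`, which is a cheap way to flag a transcription or scanning error before a date is relied on.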







Authorized officer 


The International Bureau of WIPO 




34, chemin des Colombettes 


R. Chrem 


1211 Geneva 20, Switzerland 


Facsimile No. (41-22) 740.14.35 


Telephone No. (41-22) 338.83.38 


Form PCT/IB/318 (July 1998) 


003643845 






For receiving Office use only 
International Application No. 



International Filing Date 



Name of receiving Office and "PCT International Application" 



Applicant's or agent's file reference 

(if desired) (12 characters maximum) 42.73369 



Box No. I TITLE OF INVENTION 

METHOE 




Box No. II APPLICANT 


Name and address: (Family name followed by given name; for a legal entity, full official 
designation. The address must include postal code and name of country. The country of the 
address indicated in this Box is the applicant's State (that is, country) of residence if no State 
of residence is indicated below.) 

COMPLETE GENOMICS AS 

P.O. Box 64 

Blindern 

N-0313 Oslo 

Norway 


| | This person is also inventor. 
Telephone No. 

Facsimile No. 

Teleprinter No. 


State (that is, country) of nationality: 

NO 


State (that is, country) of residence: 

NO 


This person is applicant for the purposes of: 
| | all designated States 
[X] all designated States except the United States of America 
| | the United States of America only 
| | the States indicated in the Supplemental Box 


Box No. III FURTHER APPLICANT(S) AND/OR (FURTHER) INVENTOR(S) 


Name and address: (Family name followed by given name; for a legal entity, full official 
designation. The address must include postal code and name of country. The country of the 
address indicated in this Box is the applicant's State (that is, country) of residence if no State 
of residence is indicated below.) 

LEXOW, Preben 
Bloksbergveien 16 
3132 Husoysund 
Norway 


This person is: 

| | applicant only 

[X] applicant and inventor 

| | inventor only (If this check-box 
is marked, do not fill in below.) 


State (that is, country) of nationality: 

NO 


State (that is, country) of residence: 

NO 


This person is applicant for the purposes of: 
| | all designated States 
| | all designated States except the United States of America 
[X] the United States of America only 
| | the States indicated in the Supplemental Box 


| X| Further applicants and/or (further) inventors are indicated on a continuation sheet. 


Box No. IV AGENT OR COMMON REPRESENTATIVE; OR ADDRESS FOR CORRESPONDENCE 


[X] agent | | common representative 


Name and address: (Family name followed by given name; for a legal entity, full official 
designation. The address must include postal code and name of country.) 

JONES, Elizabeth Louise 

Frank B. Dehn & Co. 

179 Queen Victoria Street 

London 

EC4V 4EL 

GB 


Telephone No. 

+44 20 7206 0600 

Facsimile No. 

+44 20 7206 0700 

Teleprinter No. 


| | Address for correspondence: Mark this check-box where no agent or common representative is/has been appointed and the 
space above is used instead to indicate a special address to which correspondence should be sent. 



Form PCT/RO/101 (first sheet) (July 1998; reprint January 2000) See Notes to the request form 



PCT 



REQUEST 



The undersigned requests that the present 
international application be processed 
according to the Patent Cooperation Treaty. 



Sheet No. 



Continuation of Box No. III FURTHER APPLICANT(S) AND/OR (FURTHER) INVENTOR(S) 


If none of the following sub-boxes is used, this sheet should not be included in the request 


Name and address: (Family name followed by given name; for a legal entity, full official 
designation. The address must include postal code and name of country. The country of the 
address indicated in this Box is the applicant's State (that is, country) of residence if no State 
of residence is indicated below.) 

JONES, Elizabeth Louise 

Frank B. Dehn & Co. 

179 Queen Victoria Street 

London 

EC4V 4EL 

United Kingdom 


This person is: 

[X] applicant only 

| | applicant and inventor 

| | inventor only (If this check-box 
is marked, do not fill in below.) 


State (that is, country) of nationality: 

GB 


State (that is, country) of residence: 

GB 


This person is applicant for the purposes of: 
| | all designated States 
| | all designated States except the United States of America 
| | the United States of America only 
[X] the States indicated in the Supplemental Box 





| | Further applicants and/or (further) inventors are indicated on another continuation sheet. 

Form PCT/RO/101 (continuation sheet) (July 1998; reprint January 2000) See Notes to the request form 



Sheet No. 3 



Box No.V 



DESIGNATION OF STATES 



The following designations are hereby made under Rule 4.9(a) (mark the applicable check-boxes; at least one must be marked). 
Regional Patent 

[X] AP ARIPO Patent: GH Ghana, GM Gambia, KE Kenya, LS Lesotho, MW Malawi, SD Sudan, SL Sierra Leone, SZ Swaziland, 
TZ United Republic of Tanzania, UG Uganda, ZW Zimbabwe, and any other State which is a Contracting State of the Harare 
Protocol and of the PCT (MZ Mozambique) 

[X] EA Eurasian Patent: AM Armenia, AZ Azerbaijan, BY Belarus, KG Kyrgyzstan, KZ Kazakhstan, MD Republic of Moldova, 
RU Russian Federation, TJ Tajikistan, TM Turkmenistan, and any other State which is a Contracting State of the Eurasian Patent 
Convention and of the PCT 

[X] EP European Patent: AT Austria, BE Belgium, CH and LI Switzerland and Liechtenstein, CY Cyprus, DE Germany, 
DK Denmark, ES Spain, FI Finland, FR France, GB United Kingdom, GR Greece, IE Ireland, IT Italy, LU Luxembourg, 
MC Monaco, NL Netherlands, PT Portugal, SE Sweden, and any other State which is a Contracting State of the European Patent 
Convention and of the PCT 

[X] OA OAPI Patent: BF Burkina Faso, BJ Benin, CF Central African Republic, CG Congo, CI Côte d'Ivoire, CM Cameroon, 
GA Gabon, GN Guinea, GW Guinea-Bissau, ML Mali, MR Mauritania, NE Niger, SN Senegal, TD Chad, TG Togo, and any 
other State which is a member State of OAPI and a Contracting State of the PCT (if other kind of protection or treatment desired, 
specify on dotted line) 

National Patent (if other kind of protection or treatment desired, specify on dotted line): 

[X] AE United Arab Emirates 
[X] AL Albania 
[X] AM Armenia 
[X] AT Austria and utility model 
[X] AU Australia 
[X] AZ Azerbaijan 
[X] BA Bosnia and Herzegovina 
[X] BB Barbados 
[X] BG Bulgaria 
[X] BR Brazil 
[X] BY Belarus 
[X] CA Canada 
[X] CH and LI Switzerland and Liechtenstein 
[X] CN China 
[X] CR Costa Rica 
[X] CU Cuba 
[X] CZ Czech Republic and utility model 
[X] DE Germany and utility model 
[X] DK Denmark and utility model 
[X] DM Dominica 
[X] EE Estonia and utility model 
[X] ES Spain 
[X] FI Finland and utility model 
[X] GB United Kingdom 
[X] GD Grenada 
[X] GE Georgia 
[X] GH Ghana 
[X] GM Gambia 
[X] HR Croatia 
[X] HU Hungary 
[X] ID Indonesia 
[X] IL Israel 
[X] IN India 
[X] IS Iceland 
[X] JP Japan 
[X] KE Kenya 
[X] KG Kyrgyzstan 
[X] KP Democratic People's Republic of Korea 
[X] KR Republic of Korea and utility model 
[X] KZ Kazakhstan 
[X] LC Saint Lucia 
[X] LK Sri Lanka 
[X] LR Liberia 
[X] LS Lesotho 
[X] LT Lithuania 
[X] LU Luxembourg 
[X] LV Latvia 
[X] MA Morocco 
[X] MD Republic of Moldova 
[X] MG Madagascar 
[X] MK The former Yugoslav Republic of Macedonia 
[X] MN Mongolia 
[X] MW Malawi 
[X] MX Mexico 
[X] NO Norway 
[X] NZ New Zealand 
[X] PL Poland 
[X] PT Portugal 
[X] RO Romania 
[X] RU Russian Federation 
[X] SD Sudan 
[X] SE Sweden 
[X] SG Singapore 
[X] SI Slovenia 
[X] SK Slovakia and utility model 
[X] SL Sierra Leone 
[X] TJ Tajikistan 
[X] TM Turkmenistan 
[X] TR Turkey 
[X] TT Trinidad and Tobago 
[X] TZ United Republic of Tanzania 
[X] UA Ukraine 
[X] UG Uganda 
[X] US United States of America 
[X] UZ Uzbekistan 
[X] VN Viet Nam 
[X] YU Yugoslavia 
[X] ZA South Africa 
[X] ZW Zimbabwe 

Check-boxes reserved for designating States which have become party to the PCT after issuance of this sheet: 

[X] MZ Mozambique 
[X] BZ Belize 
[X] DZ Algeria 
[X] AG Antigua & Barbuda 



Precautionary Designation Statement: In addition to the designations made above, the applicant also makes under Rule 4.9(b) all other 
designations which would be permitted under the PCT except any designation(s) indicated in the Supplemental Box as being excluded 
from the scope of this statement. The applicant declares that those additional designations are subject to confirmation and that any 
designation which is not confirmed before the expiration of 15 months from the priority date is to be regarded as withdrawn by the applicant 
at the expiration of that time limit. (Confirmation (including fees) must reach the receiving Office within the 15-month time limit.) 



Form PCT/RO/101 (second sheet) (January 2000) 



See Notes to the request form 



Sheet No. 4 



Supplemental Box If the Supplemental Box is not used, this sheet should not be included in the request. 



1. If, in any of the Boxes, the space is insufficient to furnish all the information: in such case, write "Continuation of Box No. ..." 
[indicate the number of the Box] and furnish the information in the same manner as required according to the captions of the Box in which 
the space was insufficient, in particular: 

(i) if more than two persons are involved as applicants and/or inventors and no "continuation sheet" is available: in such case, write 
"Continuation of Box No. III" and indicate for each additional person the same type of information as required in Box No. III. The 
country of the address indicated in this Box is the applicant's State (that is, country) of residence if no State of residence is indicated 
below; 

(ii) if, in Box No. II or in any of the sub-boxes of Box No. III, the indication "the States indicated in the Supplemental Box" is checked: 
in such case, write "Continuation of Box No. II" or "Continuation of Box No. III" or "Continuation of Boxes No. II and No. III" 
(as the case may be), indicate the name of the applicant(s) involved and, next to (each) such name, the State(s) (and/or, where 
applicable, ARIPO, Eurasian, European or OAPI patent) for the purposes of which the named person is applicant; 

(iii) if, in Box No. II or in any of the sub-boxes of Box No. III, the inventor or the inventor/applicant is not inventor for the purposes 
of all designated States or for the purposes of the United States of America: in such case, write "Continuation of Box No. II" or 
"Continuation of Box No. III" or "Continuation of Boxes No. II and No. III" (as the case may be), indicate the name of the 
inventor(s) and, next to (each) such name, the State(s) (and/or, where applicable, ARIPO, Eurasian, European or OAPI patent) for 
the purposes of which the named person is inventor; 

(iv) if, in addition to the agent(s) indicated in Box No. IV, there are further agents: in such case, write "Continuation of Box No. IV" 
and indicate for each further agent the same type of information as required in Box No. IV; 

(v) if, in Box No. V, the name of any State (or OAPI) is accompanied by the indication "patent of addition," or "certificate of addition," 
or if, in Box No. V, the name of the United States of America is accompanied by an indication "continuation" or 
"continuation-in-part": in such case, write "Continuation of Box No. V" and the name of each State involved (or OAPI), and after the name of 
each such State (or OAPI), the number of the parent title or parent application and the date of grant of the parent title or filing 
of the parent application; 

(vi) if, in Box No. VI, there are more than three earlier applications whose priority is claimed: in such case, write "Continuation of 
Box No. VI" and indicate for each additional earlier application the same type of information as required in Box No. VI; 

(vii) if, in Box No. VI, the earlier application is an ARIPO application: in such case, write "Continuation of Box No. VI", specify the 
number of the item corresponding to that earlier application and indicate at least one country party to the Paris Convention for 
the Protection of Industrial Property or one Member of the World Trade Organization for which that earlier application was filed. 

2. If, with regard to the precautionary designation statement contained in Box No. V, the applicant wishes to exclude any State(s) from 
the scope of that statement: in such case, write "Designation(s) excluded from precautionary designation statement" and indicate the 
name or two-letter code of each State so excluded. 

3. If the applicant claims, in respect of any designated Office, the benefits of provisions of the national law concerning non-prejudicial 
disclosures or exceptions to lack of novelty: in such case, write "Statement concerning non-prejudicial disclosures or exceptions to lack 
of novelty" and furnish that statement below. 



CONTINUATION OF BOX NO. IV 

JONES, Elizabeth Louise is applicant in respect of the GB designation only. 



The following, also of Frank B. Dehn & Co., are also appointed as agents: 



Watkins, A.J.; Leale, R.G.; Woodman, D.; Skailes, H.J.; Tomlinson, K.J.; Butler, M.J.; Pett, 
C.P.; Cockbain, J.R.M.; Davies, C.R.; Piesold, A.J.; Matthews, D.P.; Dzieglewska, H.E.; 
Calamita, R.; Leckey, D.H.; Hague, A.J.; Towler, P.D.; Hughes, A.M.; Tothill, J.P.; Marsden, 
J.C.; Grant, A.R.; Golding, L.A.; Jackson, R.P.; Jones, E.L.; Hall, M.B.; Stevens, J.P.; Dixon, 
P.M.; Hancox, J.C.; Gardner, R.K.; Jeffrey, P.M.; Beacham, A.R.; Moy, D.; Samuels, A.J.; 
Campbell, N.B. 



Form PCT/RO/101 (supplemental sheet) (January 2000) 



See Notes to the request form 



Sheet No. 



Box No. VI PRIORITY CLAIM 



| | Further priority claims are indicated in the Supplemental Box 



Where earlier application is a national application, the country is given; the columns for a regional application* (regional Office) and an international application (receiving Office) are blank. 

Item       Filing date of earlier application   Number of earlier application   National application: 
           (day/month/year)                                                     country 
item (1)   27 June 1999                         19991325                        NO 
item (2)   20 June 2000                         20003190                        NO 
item (3)   20 June 2000                         20003191                        NO 



| | The receiving Office is requested to prepare and transmit to the International Bureau a certified copy 
of the earlier application(s) (only if the earlier application was filed with the Office which for the 
purposes of the present international application is the receiving Office) identified above as item(s). 

* Where the earlier application is an ARIPO application, it is mandatory to indicate in the Supplemental Box at least one country party to the Paris 
Convention for the Protection of Industrial Property for which that earlier application was filed (Rule 4.10(b)(ii)). See Supplemental Box. 



Box No. VII INTERNATIONAL SEARCHING AUTHORITY 



Choice of International Searching Authority (ISA) 

(if two or more International Searching Authorities are 
competent to carry out the international search, indicate 
the Authority chosen; the two-letter code may be used) ; 

ISA/ 



Request to use results of earlier search; reference to that search (if an earlier 
search has been carried out by or requested from the International Searching Authority): 



Date (day/month/year) 



Number 



Country (or regional Office) 



Box No. VIII CHECK LIST; LANGUAGE OF FILING 



This international application contains 
the following number of sheets: 

request: 5 
description (excluding sequence listing part): 77 
claims: 6 
abstract: 1 
drawings: 5 
sequence listing part of description: 

Total number of sheets: 94 



This international application is accompanied by the item(s) marked below: 

1. [X] fee calculation sheet 

2. □ separate signed power of attorney 

3. □ copy of general power of attorney, reference number, if any: 

4. □ statement explaining lack of signature 

5. □ priority document(s) identified in Box No. VI as item(s): 

6. □ translation of international application into (language): 

7. □ separate indications concerning deposited microorganism or other biological material 

8. □ nucleotide and/or amino acid sequence listing in computer readable form 

9. [X] other (specify): [handwritten entry illegible] 



Figure of the drawings which 
should accompany the abstract: 


Language of filing of the 
international application: English 


Box No. IX SIGNATURE OF APPLICANT OR AGENT 



Next to each signature, indicate the name of the person signing and the capacity in which the person signs (if such capacity is not obvious from reading the request). 






JONES, Elizabeth Louise - Professional Representative 



For receiving Office use only 



1 . Date of actual receipt of the purported 
international application: 



Corrected date of actual receipt due to later but 
timely received papers or drawings completing 
the purported international application: 



Date of timely receipt of the required 
corrections under PCT Article 11(2): 



5. International Searching Authority 
(if two or more are competent): ISA/ 



□ 



Transmittal of search copy delayed 
until search fee is paid. 



2. Drawings: 
| | received: 

| | not received: 



For International Bureau use only 



Date of receipt of the record copy 
by the International Bureau: 



See Notes to the request form 



Form PCT/RO/101 (last sheet) (July 1998; reprint January 2000) 



This sheet is not part of and does not count as a sheet of the international application. 



PCT 



FEE CALCULATION SHEET 
Annex to the Request 



Applicant's or agent's file reference: 42.73369 

Applicant: COMPLETE GENOMICS AS ET AL 

For receiving Office use only 

International application No. 

Date stamp of the receiving Office 


CALCULATION OF PRESCRIBED FEES 

1. TRANSMITTAL FEE                                                          T    GBP 55 

2. SEARCH FEE                                                               S    GBP 605 

International search to be carried out by _____ 

(If two or more International Searching Authorities are competent in relation to the international 
application, indicate the name of the Authority which is chosen to carry out the international search.) 

3. INTERNATIONAL FEE 

Basic Fee 

The international application contains 94 sheets. 

first 30 sheets                                                             b1   GBP 264 

64 remaining sheets x additional amount =                                   b2   GBP 384 

Add amounts entered at b1 and b2 and enter total at B                       B    GBP 648 

Designation Fees 

The international application contains ___ designations. 

8 (number of designation fees payable, maximum 8) x GBP 56 (amount of designation fee) =   D    GBP 448 

Add amounts entered at B and D and enter total at I                         I    GBP 1096 

(Applicants from certain States are entitled to a reduction of 75% of the 
international fee. Where the applicant is (or all applicants are) so entitled, the 
total to be entered at I is 25% of the sum of the amounts entered at B and D.) 

4. FEE FOR PRIORITY DOCUMENT (if applicable)                                P 

5. TOTAL FEES PAYABLE 

Add amounts entered at T, S, I and P, and enter total in the TOTAL box      TOTAL   GBP 1756 

| | The designation fees are not paid at this time. 
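
The fee sheet is a small chain of sums, so the entered amounts can be re-derived and cross-checked. The sketch below is illustrative only: the amounts are the ones entered on this sheet, while the variable names, the function `fee_breakdown`, and the treatment of the blank priority-document fee as zero are assumptions made for the example. The per-sheet rate is derived from the entered b2 amount rather than read from the form.

```python
# Amounts as entered on the fee calculation sheet, in GBP.
TRANSMITTAL_FEE = 55             # T
SEARCH_FEE = 605                 # S
BASIC_FEE_FIRST_30 = 264         # b1
ADDITIONAL_SHEETS_AMOUNT = 384   # b2
DESIGNATION_FEE = 56             # amount of one designation fee
DESIGNATION_FEES_PAYABLE = 8     # capped at 8 on this form
PRIORITY_DOCUMENT_FEE = 0        # P is blank on the sheet; assumed to be zero

TOTAL_SHEETS = 94
SHEETS_COVERED_BY_BASIC_FEE = 30


def fee_breakdown() -> dict:
    """Recompute the sub-totals B, D, I and the grand total."""
    remaining_sheets = TOTAL_SHEETS - SHEETS_COVERED_BY_BASIC_FEE
    basic_fee = BASIC_FEE_FIRST_30 + ADDITIONAL_SHEETS_AMOUNT          # B
    designation_fees = DESIGNATION_FEES_PAYABLE * DESIGNATION_FEE      # D
    international_fee = basic_fee + designation_fees                   # I
    total = (TRANSMITTAL_FEE + SEARCH_FEE
             + international_fee + PRIORITY_DOCUMENT_FEE)
    return {
        "remaining_sheets": remaining_sheets,
        # Derived, not stated on the form: b2 divided by the extra sheets.
        "implied_rate_per_extra_sheet": ADDITIONAL_SHEETS_AMOUNT / remaining_sheets,
        "B": basic_fee,
        "D": designation_fees,
        "I": international_fee,
        "TOTAL": total,
    }


result = fee_breakdown()
for name, value in result.items():
    print(f"{name}: {value}")
```

With these inputs the recomputed values agree with the sheet: B is 648, D is 448, I is 1096 and the total is 1756.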



MODE OF PAYMENT 

| | authorization to charge deposit account (see below) 

[X] cheque 

| | postal money order 

| | bank draft 

| | cash 

| | revenue stamps 

| | coupons 

| | other (specify): 



DEPOSIT ACCOUNT AUTHORIZATION (this mode of payment may not be available at all receiving Offices) 
The RO/ ___ | | is hereby authorized to charge the total fees indicated above to my deposit account. 

□ (this check-box may be marked only if the conditions for deposit accounts of the receiving Office so permit) is 
hereby authorized to charge any deficiency or credit any overpayment in the total fees indicated above to my 
deposit account. 

□ is hereby authorized to charge the fee for preparation and transmittal of the priority document to the International 
Bureau of WIPO to my deposit account. 



Deposit Account No. 



Date (day/month/year) 



Signature 



Form PCT/RO/101 (Annex) (January 2000) 



See Notes to the fee calculation sheet 



The demand must be filed directly with the competent International Preliminary Examining Authority or, if two or more Authorities are competent, 
with the one chosen by the applicant. The full name or two-letter code of that Authority may be indicated by the applicant on the line below: 

IPEA/EPO 



PCT 



CHAPTER II 



DEMAND 

under Article 31 of the Patent Cooperation Treaty. 
The undersigned requests that the international application specified below be the subject of 
international preliminary examination according to the Patent Cooperation Treaty and 
hereby elects all eligible States (except where otherwise indicated). 



Identification of IPEA 


Date of receipt of DEMAND 


Box No. I IDENTIFICATION OF THE INTERNATIONAL APPLICATION 


Applicant's or agent's file reference 
42.1.73369 


International application No. 
PCT/GB00/02512 


International filing date (day/month/year) 
27 JUNE 2000 (27/06/00) 


(Earliest) Priority date (day/month/year) 
28 JUNE 1999 (28/06/99) 



Title of invention 

Methods of Cloning and Producing Fragment Chains with Readable Information Content 



Box No. II APPLICANT(S) 



Name and address: (Family name followed by given name; for a legal entity, full official designation. 
The address must include postal code and name of country.) 

Complete Genomics AS 
PO Box 64 Blindern 
N-0313 Oslo 
Norway 

Telephone No.: 

Facsimile No.: 




Teleprinter No.: 



State (that is, country) of nationality: 



NO 



State (that is, country) of residence: 



NO 



Name and address: (Family name followed by given name; for a legal entity, full official designation. The address must include postal code and name of country.) 

LEXOW, Preben 
Bloksbergveien 16 
N-3132 Husoysund 
Norway 

State (that is, country) of nationality: NO 

State (that is, country) of residence: NO 



Name and address: (Family name followed by given name; for a legal entity, full official designation. The address must include postal code and name of country.) 



JONES, Elizabeth Louise 
Frank B. Dehn & Co 
179 Queen Victoria Street 
London EC4V 4EL 
United Kingdom 



State (that is, country) of nationality: 

GB 


State (that is, country) of residence: 

GB 


| | Further applicants are indicated on a continuation sheet. 
Box No. III AGENT OR COMMON REPRESENTATIVE; OR ADDRESS FOR CORRESPONDENCE 



The following person is [X] agent | | common representative 

and [X] has been appointed earlier and represents the applicant(s) also for international preliminary examination. 

| | is hereby appointed and any earlier appointment of (an) agent(s)/common representative is hereby revoked. 

| | is hereby appointed, specifically for the procedure before the International Preliminary Examining Authority, in addition to 
the agent(s)/common representative appointed earlier. 



Name and address: (Family name followed by given name; for a legal entity, full official designation. 
The address must include postal code and name of country.) 



JONES, Elizabeth Louise 

Frank B. Dehn & Co. 

179 Queen Victoria Street 

London 

EC4V 4EL 

GB 



Telephone No.: 

+44 20 7206 0600 



Facsimile No.: 

+44 20 7206 0700 



Teleprinter No.: 



| | Address for correspondence: Mark this check-box where no agent or common representative is/has been appointed and the 
space above is used instead to indicate a special address to which correspondence should be sent. 



Box No. IV BASIS FOR INTERNATIONAL PRELIMINARY EXAMINATION 



Statement concerning amendments: * 

1. The applicant wishes the international preliminary examination to start on the basis of: 
[X] the international application as originally filed 

the description   | | as originally filed 
                  | | as amended under Article 34 

the claims        | | as originally filed 
                  | | as amended under Article 19 (together with any accompanying statement) 
                  | | as amended under Article 34 

the drawings      | | as originally filed 
                  | | as amended under Article 34 

2. | | The applicant wishes any amendment to the claims under Article 19 to be considered as reversed. 

3. | | The applicant wishes the start of the international preliminary examination to be postponed until the expiration of 20 months 
from the priority date unless the International Preliminary Examining Authority receives a copy of any amendments made 
under Article 19 or a notice from the applicant that he does not wish to make such amendments (Rule 69.1(d)). (This check- 
box may be marked only where the time limit under Article 19 has not yet expired.) 

Where no check-box is marked, international preliminary examination will start on the basis of the international application 
as originally filed or, where a copy of amendments to the claims under Article 19 and/or amendments of the international application 
under Article 34 are received by the International Preliminary Examining Authority before it has begun to draw up a written opinion 
or the international preliminary examination report, as so amended. 



Language for the purposes of international preliminary examination: English 

[X] which is the language in which the international application was filed. 

| | which is the language of a translation furnished for the purposes of international search. 

[X] which is the language of publication of the international application. 

| | which is the language of the translation (to be) furnished for the purposes of international preliminary examination. 



Box No. V ELECTION OF STATES 



The applicant hereby elects all eligible States (that is, all States which have been designated and which are bound by Chapter II of
the PCT)

excluding the following States which the applicant wishes not to elect:

International application No.

PCT/GB00/02512 



Box No. VI CHECK LIST 



The demand is accompanied by the following elements, in the language referred to in
Box No. IV, for the purposes of international preliminary examination:

                                                            For International Preliminary
                                                            Examining Authority use only
                                                            received    not received
1. translation of international application      : sheets      □            □
2. amendments under Article 34                   : sheets      □            □
3. copy (or, where required, translation) of
   amendments under Article 19                   : sheets      □            □
4. copy (or, where required, translation) of
   statement under Article 19                    : sheets      □            □
5. letter                                        : sheets      □            □
6. other (specify)                               : sheets      □            □



The demand is also accompanied by the item(s) marked below:

1. [X] fee calculation sheet

2. □ separate signed power of attorney

3. □ copy of general power of attorney;
reference number, if any:

4. □ statement explaining lack of signature

5. □ nucleotide and/or amino acid sequence listing in
computer readable form

6. □ other (specify):



Box No. VII SIGNATURE OF APPLICANT, AGENT OR COMMON REPRESENTATIVE



Next to each signature, indicate the name of the person signing and the capacity in which the person signs (if such capacity is not obvious from reading the demand). 



JONES, Elizabeth Louise - Professional Representative 



For International Preliminary Examining Authority use only

1. Date of actual receipt of DEMAND:

2. Adjusted date of receipt of demand due
to CORRECTIONS under Rule 60.1(b):

3. □ The date of receipt of the demand is AFTER the expiration of 19 months
from the priority date and item 4 or 5, below, does not apply.

□ The applicant has been
informed accordingly.

4. □ The date of receipt of the demand is WITHIN the period of 19 months from the priority date as extended by virtue of
Rule 80.5.

5. □ Although the date of receipt of the demand is after the expiration of 19 months from the priority date, the delay in arrival
is EXCUSED pursuant to Rule 82.



For International Bureau use only 



Demand received from IPEA on: 



Form PCT/IPEA/401 (last sheet) (July 1998; reprint January 2000)



See Notes to the demand form 



PCT 



CHAPTER II 



FEE CALCULATION SHEET 
Annex to the Demand for international preliminary examination 

For International Preliminary Examining Authority use only

International application No.



PCT/GB00/02512 



Applicant's or agent's 
file reference 



42.1.73369 



Date stamp of the IPEA 



Applicant 



Complete Genomics AS 



Calculation of prescribed fees

1. Preliminary examination fee                                   EUR 1533   P

2. Handling fee (Applicants from certain States are
entitled to a reduction of 75% of the handling fee.
Where the applicant is (or all applicants are) so entitled,
the amount to be entered at H is 25% of the
handling fee.)                                                   EUR 147    H

3. Total of prescribed fees
Add the amounts entered at P and H
and enter total in the TOTAL box                                 EUR 1680   TOTAL
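
The arithmetic on this sheet can be checked with a short sketch. The amounts are the ones entered above (EUR 1533 and EUR 147); the function name and the `reduced` flag are illustrative and not part of the form, and since the sheet does not say whether the 75% reduction was applied to the handling fee entered, the reduction branch is shown only in general form.

```python
from decimal import Decimal

def total_prescribed_fees(p, h_full, reduced=False):
    """Return (H, TOTAL) given the preliminary examination fee P and the full handling fee.

    If `reduced` is True, the amount entered at H is 25% of the handling fee,
    as the note on the sheet describes for applicants entitled to the reduction.
    """
    h = h_full * Decimal("0.25") if reduced else h_full
    return h, p + h

# Amounts as entered on this sheet (no reduction assumed).
h, total = total_prescribed_fees(Decimal("1533"), Decimal("147"))
print(h, total)  # 147 1680
```

The computed total agrees with the EUR 1680 entered in the TOTAL box.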



Mode of Payment

[X] authorization to charge deposit account with the IPEA (see below)

□ cheque

□ postal money order

□ bank draft

□ cash

□ revenue stamps

□ coupons

□ other (specify):



Deposit Account Authorization (this mode of payment may not be available at all IPEAs)

The IPEA/EPO is hereby authorized to charge the total fees indicated above to my deposit account.

[X] (this check-box may be marked only if the conditions for deposit accounts of the IPEA so permit) is hereby
authorized to charge any deficiency or credit any overpayment in the total fees indicated above to
my deposit account.

Deposit Account Number: 28050069
Date (day/month/year): 25 January 2001

Form PCT/IPEA/401 (Annex) (July 1998; reprint January 2000)



Signature 



See Notes to the fee calculation sheet 






From the: 

INTERNATIONAL PRELIMINARY EXAMINING AUTHORITY 



PATENT COOPERATION TREATY

by fax and post



To: 

JONES, Elisabeth L. 
Frank B. Dehn & CO. 
179 Queen Victoria Street 
London EC4V 4EL 
GRANDE BRETAGNE 





PCT 

WRITTEN OPINION 
(PCT Rule 66) 



Date of mailing (day/month/year)
27.08.2001



Applicant's or agent's file reference 
42.1.73369 



within 1 month(s) 

from the above date of mailing 



International application No. 
PCT/GB00/02512



International filing date (day/month/year) 
27/06/2000 



Priority date (day/month/year) 
28/06/1999 



International Patent Classification (IPC) or both national classification and IPC 
C12N15/10 



Applicant 

COMPLETE GENOMICS AS et al. 






1. This written opinion is the first drawn up by this International Preliminary Examining Authority.

2. This opinion contains indications relating to the following items:

I   [X] Basis of the opinion
IV  [X] Lack of unity of invention
V   [X] Reasoned statement under Rule 66.2(a)(ii) with regard to novelty, inventive step or industrial applicability;
        citations and explanations supporting such statement
VI  [X] Certain documents cited

3. The applicant is hereby invited to reply to this opinion.

When? See the time limit indicated above. The applicant may, before the expiration of that time limit,
request this Authority to grant an extension, see Rule 66.2(d).

How? By submitting a written reply, accompanied, where appropriate, by amendments, according to Rule 66.3.
For the form and the language of the amendments, see Rules 66.8 and 66.9.

Also: For an additional opportunity to submit amendments, see Rule 66.4.
For the examiner's obligation to consider amendments and/or arguments, see Rule 66.4bis.
For an informal communication with the examiner, see Rule 66.6.

If no reply is filed, the international preliminary examination report will be established on the basis of this opinion.

4. The final date by which the international preliminary
examination report must be established according to Rule 69.2 is: 28/10/2001.



Name and mailing address of the international 
preliminary examining authority: 

European Patent Office
D-80298 Munich
Tel. +49 89 2399 - 0 Tx: 523656 epmu d
Fax: +49 89 2399 - 4465



Authorized officer / Examiner 
Barnas, C 



Formalities officer (incl. extension of time limits) 
Hingel, W 

Telephone No. +49 89 2399 8717 




Form PCT/IPEA/408 (cover sheet) (January 1994) 



WRITTEN OPINION 



International application No. PCT/GB00/02512



I. Basis of the opinion 

1. With regard to the elements of the international application (Replacement sheets which have been furnished to
the receiving Office in response to an invitation under Article 14 are referred to in this opinion as "originally filed"):

Description, pages:

1-77 as originally filed

Claims, No.:

1-28 as originally filed

Drawings, sheets: 

1-6 as originally filed 

Sequence listing part of the description, pages: 
1-23, filed with the letter of 5.9.2000 

2. With regard to the language, all the elements marked above were available or furnished to this Authority in the 
language in which the international application was filed, unless otherwise indicated under this item. 

These elements were available or furnished to this Authority in the following language: , which is:

□ the language of a translation furnished for the purposes of the international search (under Rule 23.1(b)).

□ the language of publication of the international application (under Rule 48.3(b)).

□ the language of a translation furnished for the purposes of international preliminary examination (under Rule
55.2 and/or 55.3).

3. With regard to any nucleotide and/or amino acid sequence disclosed in the international application, the
international preliminary examination was carried out on the basis of the sequence listing:

□ contained in the international application in written form.

□ filed together with the international application in computer readable form.

[X] furnished subsequently to this Authority in written form.

[X] furnished subsequently to this Authority in computer readable form.

[X] The statement that the subsequently furnished written sequence listing does not go beyond the disclosure in
the international application as filed has been furnished.

[X] The statement that the information recorded in computer readable form is identical to the written sequence
listing has been furnished.

4. The amendments have resulted in the cancellation of:

□ the description, pages:
□ the claims, Nos.:
□ the drawings, sheets:

5. □ This report has been established as if (some of) the amendments had not been made, since they have been
considered to go beyond the disclosure as filed (Rule 70.2(c)):

(Any replacement sheet containing such amendments must be referred to under item 1 and annexed to this
report.)

6. Additional observations, if necessary:

IV. Lack of unity of invention

1. In response to the invitation (Form PCT/IPEA/405) to restrict or pay additional fees, the applicant has:

□ restricted the claims.

paid additional fees.

□ paid additional fees under protest.

□ neither restricted nor paid additional fees.

2. □ This Authority found that the requirement of unity of invention is not complied with for the following reasons
and chose, according to Rule 68.1, not to invite the applicant to restrict or pay additional fees:

3. Consequently, the following parts of the international application were the subject of international preliminary
examination in establishing this opinion:

[X] all parts.

□ the parts relating to claims Nos. .

V. Reasoned statement under Rule 66.2(a)(ii) with regard to novelty, inventive step or industrial applicability; 
citations and explanations supporting such statement 

1. Statement 

Novelty (N) Claims 1-6, 8, 11-16, 18, 19, 25, 26

Inventive step (IS) Claims 10, 17, 27

Industrial applicability (IA) Claims 

2. Citations and explanations
see separate sheet

VI. Certain documents cited

1. Certain published documents (Rule 70.10) 
and / or 

2. Non-written disclosures (Rule 70.9)
see separate sheet

Re Item I

Basis of the opinion

Sequence listing pages 1-23 filed with the letter of 5.9.2000 do not form part of the
application (Rule 13ter.1(f) PCT).

Re Item V 

Reasoned statement under Rule 66.2(a)(ii) with regard to novelty, inventive step or 
industrial applicability; citations and explanations supporting such statement 

D1: VERMERSCH P S ET AL: 'THE USE OF A SELECTABLE FOK-I CASSETTE IN DNA REPLACEMENT
MUTAGENESIS OF THE R388 DIHYDROFOLATE REDUCTASE GENE' GENE (AMSTERDAM), vol.
54, no. 2-3, 1987, pages 229-238, XP002149816 ISSN: 0378-1119

D2: BRAKE A J ET AL: 'ALPHA-FACTOR-DIRECTED SYNTHESIS AND SECRETION OF MATURE
FOREIGN PROTEINS IN SACCHAROMYCES-CEREVISIAE' PROCEEDINGS OF THE NATIONAL
ACADEMY OF SCIENCES OF THE UNITED STATES, vol. 81, no. 15, 1984, pages 4642-4646,
XP002149815 ISSN: 0027-8424

D3: MANDECKI W ET AL: 'FOK-I METHOD OF GENE SYNTHESIS' GENE (AMSTERDAM), vol. 68, no.
1, 1988, pages 101-107, XP002149817 ISSN: 0378-1119

D4: WO 98 38326 A (ZINK MARY ANN ;XU GUOPING (US); HODGSON CLAGUE P (US); NATURE TECH)
3 September 1998 (1998-09-03)

1. Art. 33(2) PCT, Novelty 

1.1. D1 (Fig. 4) discloses a cloning strategy identical to the method of claim 1 wherein

the "fragment of the first nucleic acid molecule" is the FokI fragment derived from
pPV9134-sup with SS1a being AAGC,

"the second nucleic acid molecule" is pPV9124-sup with SS2 being CGCA and

the "adapter molecule" is the synthetic duplex or synthetic oligonucleotide with SSA1 being
TTCG and SSA2 being TGCG. The synthetic oligonucleotide (adapter) contains a BalI
restriction site.

D1 is, therefore, novelty destroying for claims 1, 4-6, 8, 11-13, and 15.

1.2. Said cloning strategy of D1 is also identical to the method of claim 16 wherein n=2
and the double strand nucleic acid fragments of step 1) are the FokI fragment derived from
pPV9134-sup and the synthetic oligonucleotide with the complementary regions AAGC
(FokI) and TTCG (oligonucl.). Both of the fragments contain code elements encoding for
amino acids.

D1 is, therefore, also novelty destroying for claims 16, 19, 25 and 26.
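
The pairing of the single-stranded regions named in 1.1 and 1.2 can be checked mechanically. The following is a minimal sketch and not part of the opinion: the function names are illustrative, and because the opinion does not state the strand orientation in which each overhang is written, the check accepts either the base-by-base complement or the reverse complement as an assumption.

```python
# Watson-Crick base pairing for DNA.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def complement(seq):
    """Return the base-by-base complement of a DNA sequence (same reading direction)."""
    return seq.translate(COMPLEMENT)

def can_anneal(overhang_a, overhang_b):
    """True if overhang_b pairs with overhang_a in either written orientation."""
    comp = complement(overhang_a)
    return overhang_b in (comp, comp[::-1])

# Overhang pairs as quoted in the opinion for D1.
pairs = {"SS1a/SSA1": ("AAGC", "TTCG"), "SS2/SSA2": ("CGCA", "TGCG")}
for name, (a, b) in pairs.items():
    print(name, can_anneal(a, b))
```

Both quoted pairs pass this check, while a mismatched pairing such as AAGC with TGCG does not.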

1.3. D2 (Fig. 2ef.) discloses a method wherein a part of the EGF gene derived by HgaI
digestion (fragment of the first nucleic acid molecule, SS1a: ACTCT, SS1b: AACTC) is
cloned into a vector (second nucleic acid molecule, SS2: TCGA) via linkers 3 and 4
(=adapters; SSA1: TGAGA and SSA2: TCGA of linker 4; SS1b: CATG of linker 3). D2 is,
therefore, novelty destroying for claims 1-6, 8, and 11-16, 19, 25 and 26.

1.4. D3 (p. 105, paragraph "Design of ..." and Figs. 1 and 4.) discloses a method of
synthesizing a nucleic acid wherein six DNA fragments containing genetic code elements
are ligated. All six fragments have single stranded (ss) regions at both termini. The 5' ss
region of fragment 1 is generated by BamHI digestion, the 3' ss region of fragment six by
HindIII digestion. The other ss regions which represent five complementary pairs have
been generated by FokI digestion. Said method is novelty destroying for claims 16, 19, 25
and 26.

1.5. Also D4 (p. 32, ln. 25 - p. 33, ln. 1) discloses a method wherein linear DNA is
synthesized by ligating 512 HgaI fragments. Said method is novelty destroying for claims
16, 18, 25 and 26.

2. Art. 33(3) PCT, Inventive Step 

2.1. Claim 1 of D4 describes a method wherein at least three nucleic acid molecules (=
two or more first nucleic acid molecules and one or more second nucleic acid molecules)
are cleaved and ligated. Claim 4 of D4 says that adapters should not be used for the
ligation. The use of adapters for the ligation is thereby made obvious. The ends of the
cleaved nucleic acid molecules are complementary to only one other overhanging end (see
claim 1a). The ends of the adaptors have, therefore, the same specificity. It would,
therefore, be clear for the skilled person that the two or more first nucleic acid molecules
and one or more second nucleic acid molecules can be bound by adapters simultaneously
in the same reaction. Claim 10 is, therefore, not inventive.

2.2. The method of claim 16 is known as outlined above. It would be clear for the skilled 
person that said method can also be applied to ligate short fragments. The skilled person 
would, therefore, according to the circumstances, use said method to ligate fragments as 
described in claim 17. Claim 17 is, therefore, not inventive. 

2.3. The method of D3 is described as a general method of synthesizing nucleic acid
molecules. The skilled person would, therefore, apply said method by ligating 10 or more
fragments. Claim 18 is, for this reason, also not inventive over D3.

2.4. Claim 27 embraces Southern Blot hybridisation of a labelled probe complementary 
to the membrane linked known DNA from D1-D3 containing genetic code elements. Such 
a hybridisation is, however, a routine method and claim 27 is, therefore, not inventive. 



Re Item VI 

Certain documents cited, Certain published documents (Rule 70.10) 

Application No     Publication date     Filing date          Priority date (valid claim)
Patent No          (day/month/year)     (day/month/year)     (day/month/year)

WO 00/39333        6.7.2000             23.12.1999           13.12.1998



The above listed document was published and filed after the priority date of the present 
application. It does, therefore, not belong to the state of the art according to Rule 64(1)(b)
PCT. However, said document claims priority dates earlier than that of the present 
application (28.6.1999). If this priority is valid, the document will become of relevance for 
the novelty of the subject matter of the present application during regional phase 
examination at the EPO. 

The applicant is requested to file new claims and/or explanations which take account of 
the above comments. The attention of the applicant is drawn to the fact that the application 
may not be amended in such a way that it contains subject-matter which extends beyond 
the content of the application as filed, Art. 34 (2) PCT. Therefore, the applicant is asked 
to indicate the basis of any amendments to the claims in the application documents 
originally filed.



From the INTERNATIONAL BUREAU



PCT 

INFORMATION CONCERNING ELECTED 
OFFICES NOTIFIED OF THEIR ELECTION 

(PCT Rule 61.3) 


To: 

JONES, Elizabeth, Louise
Frank B. Dehn & Co.
179 Queen Victoria Street
London EC4V 4EL
ROYAUME-UNI


RECEIVED


Date of mailing (day/month/year) 
05 April 2001 (05.04.01) 






Applicant's or agent's file reference 
42.73369 


IMPORTANT INFORMATION


International application No.
PCT/GB00/02512

International filing date (day/month/year)
27 June 2000 (27.06.00)

Priority date (day/month/year)
28 June 1999 (28.06.99)


Applicant 

COMPLETE GENOMICS AS et al 



1. The applicant is hereby informed that the International Bureau has, according to Article 31(7), notified each of the following 
Offices of its election: 

AP: GH, GM, KE, LS, MW, MZ, SD, SL, SZ, TZ, UG, ZW

EP: AT, BE, CH, CY, DE, DK, ES, FI, FR, GB, GR, IE, IT, LU, MC, NL, PT, SE

National: AU, BG, CA, CN, CZ, DE, IL, JP, KP, KR, MN, NO, NZ, PL, RO, RU, SE, SK, US



The following Offices have waived the requirement for the notification of their election; the notification will be sent to them 
by the International Bureau only upon their request: 

EA: AM, AZ, BY, KG, KZ, MD, RU, TJ, TM

OA: BF, BJ, CF, CG, CI, CM, GA, GN, GW, ML, MR, NE, SN, TD, TG

National: AE, AG, AL, AM, AT, AZ, BA, BB, BR, BY, BZ, CH, CR, CU, DK, DM, DZ, EE, ES, FI, GB,
GD, GE, GH, GM, HR, HU, ID, IN, IS, KE, KG, [illegible], MZ, PT, SD, SG, [illegible], TJ, TM, TR, TT, TZ, UA, UG,
[illegible], VN, [illegible], ZW

The applicant is reminded that he must enter the "national phase" before the expiration of 30 months from the priority date 
before each of the Offices listed above. This must be done by paying the national fee(s) and furnishing, if prescribed, a
translation of the international application (Article 39(1)(a)), as well as, where applicable, by furnishing a translation of any 
annexes of the international preliminary examination report (Article 36(3)(b) and Rule 74.1). 

Some offices have fixed time limits expiring later than the above-mentioned time limit. For detailed information about the 
applicable time limits and the acts to be performed upon entry into the national phase before a particular Office, see Volume II 
of the PCT Applicant's Guide. 

The entry into the European regional phase is postponed until 31 months from the priority date for all States designated for 
the purposes of obtaining a European patent.



The International Bureau of WIPO 
34, chemin des Colombettes
1211 Geneva 20, Switzerland 



Facsimile No. (41-22) 740.14.35 



Authorized officer: 



Pascal Piriou

Telephone No. (41-22) 338.83.38 




Form PCT/IB/332 (September 1997) 



3947751 



PATENT COOPERATION TREATY



WO 01/00816
PCT/GB00/02512



From the INTERNATIONAL BUREAU 



PCT 



NOTICE INFORMING THE APPLICANT OF THE 
COMMUNICATION OF THE INTERNATIONAL 
APPLICATION TO THE DESIGNATED OFFICES 

(PCT Rule 47.1(c), first sentence) 



Date of mailing (day/month/year) 
04 January 2001 (04.01.01) 



Applicant's or agent's file reference 
42.73369 



To:

JONES, Elizabeth, Louise
Frank B. Dehn & Co.
179 Queen Victoria Street
London EC4V 4EL

12 JAN 2001



International application No. 
PCT/GB00/02512 



International filing date (day/month/year) 
27 June 2000 (27.06.00) 



Priority date (day/month/year) 

28 June 1999 (28.06.99) 



Applicant 



COMPLETE GENOMICS AS et al 



1. Notice is hereby given that the International Bureau has communicated, as provided in Article 20, the international application 
to the following designated Offices on the date indicated above as the date of mailing of this Notice: 

AG,AU,BZ,DZ,KP,KR,MZ,US 

In accordance with Rule 47.1(c), third sentence, those Offices will accept the present Notice as conclusive evidence that 
the communication of the international application has duly taken place on the date of mailing indicated above and no copy 
of the international application is required to be furnished by the applicant to the designated Office(s). 

2. The following designated Offices have waived the requirement for such a communication at this time: 

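AE, AL, AM, AP, AT, AZ, BA, BB, BG, BR, BY, CA, CH, CN, CR, CU, CZ, DE, DK, DM, EA, EE, EP, ES, FI, GB, GD,
GE, GH, GM, HR, HU, ID, IL, IN, IS, JP, KE, KG, KZ, LC, LK, LR, LS, LT, LU, LV, MA, MD, MG, MK, MN, MW, MX,
NO, NZ, OA, PL, PT, RO, RU, SD, SE, SG, SI, SK, SL, TJ, [remainder of list illegible]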

The communication will be made to those Offices only upon their request. Furthermore, those Offices do not require the 
applicant to furnish a copy of the international application (Rule 49.1 (a-bis)). 

3. Enclosed with this Notice is a copy of the international application as published by the International Bureau on 
04 January 2001 (04.01.01) under No. WO 01/00816 

REMINDER REGARDING CHAPTER II (Article 31(2)(a) and Rule 54.2) 

If the applicant wishes to postpone entry into the national phase until 30 months (or later in some Offices) from the priority 
date, a demand for international preliminary examination must be filed with the competent International Preliminary 
Examining Authority before the expiration of 19 months from the priority date. 

It is the applicant's sole responsibility to monitor the 19-month time limit. 
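
The time limits referred to in these reminders can be worked out from the priority date shown on this Notice (28 June 1999). The sketch below is an illustration only: `add_months` is a hypothetical helper that lands on the same day number in the later month (or on the last day of that month if it has no such day), and it ignores any extension for non-working days under Rule 80.5.

```python
import calendar
from datetime import date

def add_months(start, months):
    """Return the date `months` calendar months after `start`.

    Keeps the same day number where the target month has one,
    otherwise falls back to the last day of the target month.
    """
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)

priority = date(1999, 6, 28)
print(add_months(priority, 19))  # 2001-01-28
print(add_months(priority, 30))  # 2001-12-28
```

On this reckoning the 19-month limit falls on 28 January 2001; the fee calculation sheet accompanying the demand in this file is dated 25 January 2001.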

Note that only an applicant who is a national or resident of a PCT Contracting State which is bound by Chapter II has the 
right to file a demand for international preliminary examination. 

REMINDER REGARDING ENTRY INTO THE NATIONAL PHASE (Article 22 or 39(1)) 

If the applicant wishes to proceed with the international application in the national phase, he must, within 20 months
or 30 months, or later in some Offices, perform the acts referred to therein before each designated or elected Office.

For further important information on the time limits and acts to be performed for entering the national phase, see the 
Annex to Form PCT/IB/301 (Notification of Receipt of Record Copy) and Volume II of the PCT Applicant's Guide. 







The International Bureau of WIPO
34, chemin des Colombettes
1211 Geneva 20, Switzerland

Authorized officer
J. Zahra

Facsimile No. (41-22) 740.14.35
Telephone No. (41-22) 338.83.38





Form PCT/IB/308 (July 1996) 



3744408 



WO 01/00816 
PCT/GB00/02512

Continuation of Form PCT/IB/308

NOTICE INFORMING THE APPLICANT OF THE COMMUNICATION OF

THE INTERNATIONAL APPLICATION TO THE DESIGNATED OFFICES 



Date of mailing (day/month/year) 
04 January 2001 (04.01.01) 


IMPORTANT NOTICE 


Applicant's or agent's file reference 
42.73369 


International application No.



BY FACSIMILE



Dear Sirs 



International Patent Application No. PCT/GB00/02512 in the name of 
Complete Genomics AS et al 

I refer to the Written Opinion which issued on this case to which a response is due by the 
extended deadline of 12 October 2001. 

An amended claim set is transmitted herewith. Duplicate copies follow with the confirmation 
copy of this letter. In this claim set claims 1 to 16 and 19 have been deleted. Claim 20 has 
been made new claim 1 by incorporating the subject matter of claims 16 and 19. Claims 21 and 
22 have similarly been converted into new claims 2 and 3. Remaining sub-claims and other 
independent claims 17, 18 and 23 to 28 have been reordered accordingly. New claim 14 
concerns a kit using the library of revised claim 13 and is based on the teaching in the 
specification in the passage bridging pages 43 and 44, particularly page 44, lines 16 to 23. For 
convenience, a hand-amended copy of the claims is enclosed. Furthermore, the amendments 
which have been made are tabulated in the attached schedule. 

My comments on the objections raised by the Examiner are as follows: 

The Examiner made no novelty objections to claims 7, 9, 10, 17, 20-24, 27 and 28. The claims 
have now been amended to comprise only the subject matter of claims 20 - 28 (and sub-claims 
17 and 18). These claims, excluding claims 18, 25 and 26 were previously found novel by the 
Examiner and thus the revised claims comprising that subject matter should similarly be found 
novel. Claim 18 is a sub-claim and should similarly be considered novel. Claims 25 and 26 
have been amended in commensurate scope with the other claims such that they now refer to 
the novel methods and molecules of preceding claims. As such it is submitted that these claims 
are similarly novel. New claim 14 contains the library of claim 13 and should similarly be 
considered novel.

With regard to inventive step, no objections were raised by the Examiner against claims 7, 9,
20-24 and 28. Thus claims in the amended claim set based on the subject matter of those claims
should similarly be considered inventive, ie. claims 1-9 and 13. Claim 10 is based on original 
claim 25 which was considered to lack an inventive step. However this claim has now been 
restricted to incorporate the methods of preceding claims which are inventive and thus this 
method claim should also be considered inventive. Similarly claims 11 and 12 (based on 
original claims 26 and 27) refer to methods of preceding claims and the products produced by 
them and thus similarly should be considered inventive. New claim 14 contains the inventive 
library of claim 13 and again should be considered inventive. 

In summary, the methods now described concern methods of synthesizing double stranded 
nucleic acid molecules in which fragments comprising code elements are linked together. The 
code elements are defined as deriving from alphanumeric (claim 1) or binary (claim 2) code or 
have a particular formula dictated by 4 to 10 nucleotides (claim 3). None of the cited 
documents are concerned with the generation of nucleic acid molecules containing code. The 
Examiner cites naturally occurring sequences which by their very nature provide a genetic code. 
Such code is however excluded in the amended claims by the restrictions to the type of code 
(alphanumeric or binary) or the length of the code element (4 to 10 nucleotides). None of
the cited documents suggest the generation of code containing molecules. The naturally 
occurring code present in nucleic acid molecules represents at most an accidental anticipation 
(now removed from the scope of the claims) which does not suggest in any way that a nucleic 
acid molecule containing a code without biological meaning should be created. As a 
consequence, the aspect of the invention to which the claims are now directed should be 
considered to be both novel and inventive. As mentioned above, the Examiner did not 
previously raise any patentability objections to this subject matter. 

Sub-claims 4 to 10 should similarly be considered novel and inventive by virtue of their 
dependence on claims 1 to 3. Claim 11 is directed to nucleic acid molecules prepared according
to the novel and inventive method and similarly should be considered novel and inventive. 

Claim 12 concerns the identification of code elements in a molecule prepared in accordance 
with the methods of the invention. In the Written Opinion, the Examiner referred to the 
inventiveness of this claim (previously claim 27) being compromised by Southern blot 
hybridisation of a labelled probe complementary to the DNA which contained genetic code 
elements. However, by virtue of the limitation in previous method claims, the product of 
methods of the invention is nucleic acid molecules having particular code elements which are 
not (for the reasons stated above) genetic code elements. Thus methods of identifying code
from molecules prepared by inventive methods and containing artificial code are similarly novel
and inventive.

The library of fragments for use in the inventive method (claim 13), to produce the inventive 
products of the invention should similarly be considered inventive as the motivation to produce 
such a library could only be derived with knowledge of the method or the intended product, 
which was not known or suggested in the prior art. Similarly kits containing the inventive library 
should be considered novel and inventive.

It is hoped that the above described amendments and comments will overcome the objections
raised by the Examiner and I therefore look forward to the issue of a positive IPER. 

Please acknowledge receipt of this letter by return of the EPO Form 1037 enclosed with the 
confirmation copy of this letter. 

Yours faithfully, 
Frank B. Dehn & Co. 



Elizabeth Jones 



Enc. 
ej



URGENT - IMMINENT PCT PUBLICATION



Dear Sirs 



International Patent Application No. PCT/GB00/02512 in the name of

Complete Genomics AS - Request for correction of Request form under Rule 91.1 PCT

I hereby request correction of an obvious error in the Request form filed on this case. 

The error appears in the priority claim which was made. Please note however that the date of 
27 June 1999 (the filing date given for application No. 19991325) has already been corrected 
under Rule 26bis.1(a) and a copy of the communication from the International Bureau
confirming this correction is enclosed. 

The error for which correction is sought is in the filing dates of Norwegian Patent Applications 
Nos. 20001390 and 20001391 which is currently given as 20 June 2000, but which should be
corrected, in both cases to read 28 June 1999. The correct filing date is clear on inspection of 
those documents. Furthermore, the relevant filing dates are provided for those applications in 
the priority claim as it stands, even though those filing dates are erroneously matched with the 
wrong application number. As such, all the relevant details of the priority applications and their 
filing dates are provided in the original priority claim and the correct numbers could readily be 
married with the correct application numbers. It is therefore submitted that the priority claim 
comprises an obvious error which should be correctable. 

In this respect I enclose herewith a copy of the front page of the certification of Norwegian 
Patent Application Nos. 20003190 and 20003191 (which have the filing date of 28 June 1999) 
which describes the origin of these applications. Taking first Application No. 20003190, you 
will note that additional documentation was filed on Application No. 19991325 (document B) 
on 28 June 1999. This became Application No. 20003190 which has a filing date of 20 June 
2000 but was accorded a filing date of 28 June 1999. The second document Application No. 
2000319 1 concerns the other portion of the material filed on 28 June 1999 and similarly 
although filed on 20 June 2000 was given the filing date that the subject matter was first filed, 
i.e. 28 June 1999. 

Correction of the priority claim is therefore requested such that the following priority is claimed:
Norwegian Patent Application No. 20003190 filed on 28 June 1999 and 
Norwegian Patent Application No. 20003191 filed on 28 June 1999. 

In the event that correction of this obvious error is refused, I propose to request that the 
International Bureau publish this request for rectification under Rule 91.1(f) PCT in the
publication of this application which is due to occur on 4 January 2001. In view of the 
imminence of the completion of the technical preparations for publication which I am informed 
will be around 15 December 2000, your immediate attention to this matter would be 
appreciated. A copy of this letter has been forwarded to the International Bureau. 

I look forward to hearing from you regarding the above-mentioned rectification. 

Yours faithfully, 
Frank B. Dehn & Co. 



Elizabeth Jones 



Enc. 

dokument er nøyaktig utskrift/kopi av
ovennevnte søknad, som opprinnelig
inngitt 2000.06.20 



It is hereby certified that the annexed
document is a true copy of the above-mentioned
application, as originally filed
on 2000.06.20 



Patent application no 19991325 marked "A" was filed with the Norwegian Patent Office on
1999.03.18. Documents marked "B" were received by this Office in that application on
1999.06.28. Those documents were made subject of a separate patent application no 20003191
marked "C" which was actually filed with this Office on 2000.06.20 but which under Section 23 of
the Patents Regulations shall be considered as filed on 1999.06.28.



Documents marked "C" are true copies of documents filed on 2000.06.20. 
Documents marked "B" are true copies of documents filed on 1999.06.28.
Documents marked "A" are true copies of documents filed on 1999.03.18. 

METHODS OF CLONING AND PRODUCING FRAGMENT CHAINS WITH READABLE INFORMATION
CONTENT 



The present invention relates to new methods of 
attaching first and second nucleic acid molecules, 
particularly methods of cloning in which adapter 
molecules mediate the binding between the first and 
second molecules, the resultant nucleic acid molecules 
thus formed and methods of generating DNA with a readily- 
readable information content and kits for performing 
such methods.

Presently known cloning methods generally involve 
the use of restriction enzymes which are used to 
generate fragments for insertion and cleave vectors to 
produce corresponding and hence complementary terminal
sequences. Generally, the enzymes which are used cut 
palindromic sequences and thus produce identical 
overhangs. Different sequences that are cut with the 
same restriction endonucleases can then be ligated 
together to form new, recombinant nucleic acids. 

However, such methods suffer from a number of 
limitations. One disadvantage in using endonucleases 
that form two identical overhangs is the formation of 
different products on ligation. If for example two 
fragments A and B are to be ligated, as a consequence of 
common overhangs the products A+A and B+B as well as the 
desired A+B will be produced. Other by-products 
resulting from other fragments produced when A and B 
were formed will also be generated, e.g. reassociation 
into the original positions. It is therefore normal to 
use a separation process using agarose gels. The 
separation procedure however often results in a 
considerable loss of DNA. 

Such methods necessarily suffer from various 
limitations including the by-products mentioned above, 
and the need to identify the desired end-products, e.g. 
if only a particular insert is to be cloned. 

Other cloning techniques have been used in which 
cloning has been performed using PCR techniques, e.g. in
which the PCR primers have IIS enzyme recognition sites. 
However, the use of PCR is disadvantageous in cloning 
techniques as it is time consuming and requires 
purification steps which result in significant loss of
yield. The PCR reaction may also introduce point
mutations and the like and the length of the fragment is 
limited to the polymerase capacity, e.g. a maximum of 
approximately 50kb. 

It has now surprisingly been found that by
generating fragments with unique single stranded regions
and then mediating the binding between a first and 
second nucleic acid molecule, many of these 
disadvantages may be avoided. In this method,
restriction nucleases are used that form non-identical
overhangs, e.g. type IP or IIS restriction
endonucleases. As will be appreciated, if one uses a
restriction endonuclease that makes overhangs of 4 base 
pairs, each fragment that is formed will have two
overhangs of 4 base pairs each. It is theoretically
possible therefore that 4^8 (ie. 65,536) fragments may be
formed with different combinations of the two overhangs. 
Thus, as a rule, each fragment formed on cleavage will 
have a unique pair of overhangs even when cleaving large 
nucleic acid molecules.
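
The figure of 65,536 can be checked with a short sketch. Python is used purely for illustration and the function names are hypothetical; the sketch assumes that each of the two overhangs is an independent sequence over the four bases A, C, G and T, and that the two ends of a fragment are counted as an ordered pair.

```python
from itertools import product

BASES = "ACGT"

def overhang_pairs(length):
    """Yield every ordered (left, right) pair of overhangs of the given length."""
    overhangs = ["".join(p) for p in product(BASES, repeat=length)]
    return product(overhangs, overhangs)

def count_overhang_pairs(length):
    # 4**length possible overhangs at each end, and two ends per fragment
    return (4 ** length) ** 2

# Enumerate the pairs for 4-base overhangs and compare with the closed form.
n_pairs = sum(1 for _ in overhang_pairs(4))
print(n_pairs)                  # 65536
print(count_overhang_pairs(4))  # 65536
```

Under the same assumptions, count_overhang_pairs(6) gives the corresponding number for 6-base overhangs (4^12).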

These unique overhangs may then be addressed and 
adjusted appropriately using adapters with two 
overhangs. For example in a cloning technique one of
the overhangs is made to correspond to the overhang on
the insert and the other overhang is made to correspond
to the overhang on the vector into which the insert is 
to be introduced. This method is outlined in Figure 1. 
In that case the DNA molecule containing the insert is 
cut with a restriction endonuclease which makes an
overhang on each side of the insert. Each of the many
fragments which are formed has different overhangs such
that the two overhangs at either end of the insert are
unique. Ligase is then added to bind two adapters with
corresponding single stranded regions. This leads to 
the formation of two new overhangs at the termini of the 
insert, which are selected such that they can be used to 
bind to the vector into which the insert is to be
cloned. Providing identical overhangs are not created
on other molecules only the desired insert will be 
ligated to the adapters. In the final step the insert 
is ligated into the vector which has two overhangs which
complement the adapters' overhangs. The overhangs in
the vector may be constructed using the same principles
as described for the insert. 

Thus in this new method, an adapter molecule is 
used which is complementary to a single stranded region
generated on the first nucleic acid molecule and
therefore binds to that molecule, but has a different
single stranded region at its other terminus, thus 
effectively modifying the single stranded region 
presented for binding by the first nucleic acid molecule
fragment. The adapter's free single stranded region may
then mediate the binding of the first nucleic acid 
molecule fragment to a second nucleic acid molecule 
exhibiting a complementary single stranded region.

This method of mediation has particular
applications for effectively identifying and selecting a
first nucleic acid molecule fragment and then mediating 
its binding to a second nucleic acid molecule where this 
was not previously possible. 

Of particular relevance to methods of cloning is
the generation of fragments for cloning which have
different single stranded regions at their termini 
relative to other fragments, which may then be selected 
and cloned into an appropriate vector. As described 
herein, such fragments are generated by the use of
enzymes which cleave outside their recognition site and
thus produce overhangs that depend on the sequence
surrounding the recognition site which is likely to vary
from fragment to fragment.

Such techniques may be used to direct only a single 
fragment to a particular vector or may be used to direct 
different fragments to different sites or indeed 
different vectors, even within the same reaction mix,
providing appropriate adapters are constructed. 

These methods have particular advantages over prior 
art methods. In particular, the whole procedure may be 
carried out in one or two steps, e.g. cutting and
ligating simultaneously or cutting and ligating
separately. Even in instances where the procedure is
performed in two steps, it will often be possible to 
perform both steps in the same buffer, e.g. since T4 DNA 
ligase is known to work well in most buffers for
restriction endonucleases. Time- and resource-consuming
precipitation procedures may therefore be avoided.
Moreover, ligations can be performed with overhangs of
4-6 bases, unlike conventional cloning where overhangs
of 0-4 bases are used, thereby increasing ligation
efficiency considerably.

Furthermore, the need to carry out gel separations 
may be avoided. The quantity of DNA required initially 
can be reduced substantially. Mutation of DNA molecules 
on UV exposure, a common occurrence in gel separation,
may also be avoided. Furthermore, laboratory staff are
not exposed to carcinogenic EtBr. Also, separation
problems which can occur when restriction cleavage
results in fragments of similar size may be avoided.
The frequency of undesirable side-products such as empty
vectors, too many inserts or incorrect orientation of
the inserts may also be reduced.

Since it is generally not problematic if the insert 
is cleaved, a small selection, e.g. of type IIS or IP
restriction endonucleases could provide far more cloning
possibilities than a corresponding selection of ordinary
type II restriction endonucleases used for conventional
cloning procedures. Having a few type IIS, IP and
similar restriction endonucleases that cleave with high
frequency allows for many cloning possibilities. 

In the specific instance of cloning of large DNA 
molecules (e.g. genomic DNA) or a solution containing 
many different DNA molecules in parallel (e.g. a cDNA 
library) it is very difficult to use conventional 
methods. If for example a large DNA molecule is cleaved 
with EcoRI, a large number of fragments may be formed
with the same overhang, and in addition a considerable 
proportion of these fragments may be of roughly the same 
size. This may lead to the formation of a large number 
of undesired ligation products, even with gel 
separation. Moreover, gel separation can be difficult 
if the insert is large. Furthermore, it is also often 
difficult, or even impossible, to find restriction 
endonucleases that will not cut large inserts. These 
problems may be reduced/eliminated using the cloning 
procedure described herein. 

If necessary, it is possible to increase the number 
of base pairs in the overhangs to (e.g.) 6 by using CjeI
or similar endonucleases to form an even greater number 
of possible variables and thus increase the probability 
of producing unique overhangs.

The advantages of the method of the invention are 
even greater in complex cloning procedures. If several 
adapters are used for example, it is possible to clone 
many different inserts into one and the same vector at a 
corresponding number of different sites in one and the 
same reaction, as described hereinafter in more detail. 

Deletions of small or large fragments may also be 
achieved using the same basic principle. This opens up
the possibility of making complex recombinations of
inter alia genomic DNA (removal of endogenous viruses in
genomes to be used for xenotransplantation, the 
insertion of a large number of genes from other genomes, 
new combinations of genes etc.). The method can also be 
used for exon-shuffling and other recombinations that
are relevant in connection with artificial evolutionary
systems.

Thus, in a first aspect, the present invention 
provides a method of attaching a fragment of a first 
nucleic acid molecule to a second nucleic acid molecule,
wherein said method comprises at least the steps: 

1) cleaving said first nucleic acid molecule with a 
nuclease which has a cleavage site separate from its 
recognition site to create at least one fragment of said
first nucleic acid molecule having a single stranded
nucleotide region (SS1a) at at least one terminus of
said fragment,

2) if necessary generating a single stranded 
nucleotide region (SS2) at at least one terminus of said
second nucleic acid molecule,

3) binding to at least one single stranded region of 
step 1) (SS1a) an adapter molecule comprising at one
terminus a single stranded region (SSA1) complementary
to the single stranded region of said first nucleic acid
molecule fragment (SS1a) and additionally comprising at
the other terminus a further single stranded region 
(SSA2) complementary to the single stranded region (SS2) 
at one terminus of said second nucleic acid molecule, 

4) ligating said adapter to said first nucleic acid 
fragment,

5) binding said adapter to said second nucleic acid
molecule, and

6) ligating said adapter to said second nucleic acid 
molecule.
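
Steps 1) and 3) above can be illustrated with a minimal sketch. It is not part of the described method: the function names are hypothetical, the default cutting offsets are modelled on a FokI-like enzyme (recognition site GGATG, top strand cut 9 bases downstream, bottom strand 13 bases downstream) and are an assumption, only the top strand of a linear molecule is represented, and only the first recognition site is considered.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def cleave_type_iis(top, site="GGATG", top_offset=9, bottom_offset=13):
    """Cut at the first occurrence of `site` on the top strand.

    Returns (upstream, overhang, downstream), where `overhang` is the single
    stranded region (SS1a) left at the 5' end of the downstream fragment,
    or None if no complete cut is possible.
    """
    start = top.find(site)
    if start < 0:
        return None
    cut_top = start + len(site) + top_offset
    cut_bottom = start + len(site) + bottom_offset
    if cut_bottom > len(top):
        return None
    return top[:cut_top], top[cut_top:cut_bottom], top[cut_bottom:]

def adapter_overhangs(fragment_overhang, vector_overhang):
    """SSA1 pairs with the fragment overhang, SSA2 with the vector overhang (SS2)."""
    return reverse_complement(fragment_overhang), reverse_complement(vector_overhang)

dna = "TT" + "GGATG" + "ACGTACGTA" + "CCAT" + "GGGG"
upstream, ss1a, downstream = cleave_type_iis(dna)
print(ss1a)                             # CCAT
print(adapter_overhangs(ss1a, "GACT"))  # ('ATGG', 'AGTC')
```

Because the overhang is taken from the sequence flanking the recognition site, a different flanking sequence gives a different SS1a and hence calls for a different adapter.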


As used herein, said first and second nucleic acid 
molecules are any naturally occurring or synthetic 
polynucleotide molecules, e.g. DNA, such as genomic or 
cDNA, PNA and their analogs, which are double stranded 
and in which single stranded regions may be generated.

Fragments of the first nucleic acid molecule are 
generated by use of a nuclease which cleaves outside its 
recognition site. One or more fragments may be
generated depending on the sites which are cleaved (e.g. 
if the site is at the extreme end of the molecule only a 
few bases may be removed rather than the production of 2 
fragments). Other nucleic acid molecule fragments
described herein may be generated by any appropriate 
means, as mentioned herein, including the techniques 
used to produce the first nucleic acid molecule 
fragments. Fragments are preferably more than 10 bases, 
e.g. 10 to 200bp, preferably more than 100 bases in 
length. For cloning applications, fragments having 
lengths in excess of 200 bases, e.g. from 200 bases to 
2kb may be used. Where longer single stranded regions 
are generated, fragments of longer lengths are also 
contemplated, e.g. 10-100kb or longer. 

"Single stranded regions" as referred to herein are 
regions of overhang at the end, ie. at the terminus of
the first, second or third nucleic acid molecules or 
adapter molecules. These regions are sufficient to 
allow specific binding of molecules having complementary 
single stranded regions and subsequent ligation between 
these molecules. Thus, the single stranded regions are 
at least 1 base in length, preferably 3 bases in length, 
but preferably at least 4 bases, e.g. from 4 to 10 
bases, e.g. 4, 5 or 6 bases in length. Single stranded 
regions up to 20 bases in length are contemplated which
will allow the use of fragments in the method of the 
invention which are up to Mb in length. 

"Binding" as used herein refers to the step of 
association of complementary single stranded regions 
(ie. non-covalent binding). Subsequent "ligation" of
the sequences achieves covalent binding. 

"Complementary" as used herein refers to specific 
base recognition via for example base-base 
complementarity. However, complementarity as referred 
to herein includes pairing of nucleotides in Watson-Crick
base-pairing in addition to pairing of nucleoside
analogs, e.g. deoxyinosine which are capable of specific
hybridization to the base in the nucleic acid molecules 
and other analogs which result in such specific 
hybridization, e.g. PNA, DNA and their analogs. 
Complementarity of one single stranded region to another
is considered to be sufficient when, under the 
conditions used, specific binding is achieved. Thus in 
the case of long single stranded regions some lack of 
base-base specificity, e.g. mis-match, may be tolerated,
e.g. if one base in a series of 10 bases is not
complementary. Such slight mismatches which do not
affect the ultimate binding and ligation of the single 
stranded regions are considered to be complementary for 
the purposes of this invention. The single stranded
regions may retain portions, on binding, which remain
single stranded, e.g. when overhangs of different sizes
are employed or the complementary portions do not 
comprise all of the single stranded regions. In such 
cases, as mentioned above, providing binding can be
achieved the single stranded regions are considered to
be complementary. In those cases, prior to ligation, 
missing bases may be filled in e.g. using Klenow 
fragment, or other appropriate techniques as necessary. 
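
The tolerance described above (e.g. one non-complementary base in a series of 10) may be sketched as follows. This is an illustration only: the function names and the threshold parameter are hypothetical, both single stranded regions are assumed to be written 5'->3' and to be of equal length, only Watson-Crick pairs are scored, and analogs such as deoxyinosine are not modelled.

```python
WATSON_CRICK = {("A", "T"), ("T", "A"), ("C", "G"), ("G", "C")}

def count_mismatches(region_a, region_b):
    """Count non-pairing positions when two regions bind antiparallel."""
    if len(region_a) != len(region_b):
        raise ValueError("this sketch only handles regions of equal length")
    return sum((x, y) not in WATSON_CRICK
               for x, y in zip(region_a, reversed(region_b)))

def is_complementary(region_a, region_b, max_mismatches_per_10=1):
    """Accept the pair if it has at most the allowed mismatches per 10 bases."""
    mismatches = count_mismatches(region_a, region_b)
    return mismatches * 10 <= max_mismatches_per_10 * len(region_a)

print(is_complementary("GATTACAGGC", "GCCTGTAATC"))  # True, perfect match
print(is_complementary("GATTACAGGC", "GCCTGTAATA"))  # True, 1 mismatch in 10
print(is_complementary("CCAT", "ATGA"))              # False, 1 mismatch in 4
```

With this threshold a short 4-base overhang must match perfectly, whereas a 10-base region may carry a single mismatch.
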
"Adapters" as referred to herein are molecules 
which adapt the first nucleic acid molecule fragment for
binding to a second or third nucleic acid molecule. 
Adapter molecules comprise at least two regions: a
first portion containing a single stranded region which
is complementary to the single stranded region on the
first nucleic acid molecule fragment and a second
portion containing a single stranded region which is
complementary to the single stranded region on the
second nucleic acid molecule. The single stranded
regions are as described hereinbefore and are preferably
on different strands making up the adapter molecule.
The above mentioned portions are at least as large as
the single stranded regions, e.g. 4 to 6 bases in 
length, although they may be longer, e.g. up to 20 bases
in length. 

A linking region between these single stranded 
regions is required for the stability of the molecule. 
Conveniently this comprises a double stranded nucleic
acid fragment, especially in methods of cloning where 
amplification, replication and/or translation are to be 
performed. However, this portion may be substituted by 
any appropriate molecule depending on the end use of the
resulting ligated molecule. Clearly, to achieve
ligation between the first and second nucleic acid
molecules appropriate attachment points and moieties for 
ligation must be provided. 

The linking portion may serve more than just a
linking function and may for example provide sequences
appropriate for primer or probe binding, e.g. for 
amplification or identification, respectively, or may 
contain integration sites for mobile elements such as 
transposons and the like. Depending on how the method
is performed, the adapters preferably do not contain
restriction sites for any restriction enzymes used in
the method of the invention thus avoiding the need to
inactivate or remove the enzymes prior to the addition
of the adapters.

Conveniently adapter molecules may be exclusively
comprised of a nucleic acid molecule in which the
various properties of the adapter are provided by the 
different regions of the adapter. 

Conveniently adapters are made up of two
complementary oligonucleotides having between 10 and 100
bases each, e.g. between 20 and 50 bases.

In the method described above, preferably at least 
one first nucleic acid molecule fragment is generated having
a single stranded region at either end (SS1a and SS1b)
to each of which an adapter binds.

Preferably the method described herein is used for 
cloning. Thus, in the method described above, an 
adapter is bound at either end of the first nucleic acid
molecule fragment (in which the adapters may be the same
or different), and the unbound end of the first adapter
is bound to the second nucleic acid molecule and the
unbound end of the second adapter binds either to the
second nucleic acid molecule (ie. at the other end 
distal to the binding of the first adapter, thereby 
forming a circular molecule) or binds to a third nucleic 
acid molecule. The first of these two alternatives may 
arise through cleavage of a circular vector to give rise
to the second nucleic acid molecule to which the
[adapter 1]:[first nucleic acid molecule
fragment]:[adapter 2] insert is bound to re-circularize
the vector. Alternatively, a linear or circular vector
may be cleaved giving rise to two or more discrete
fragments (herein the second and third nucleic acid 
molecules) which may be joined by the adapter 1 : first 
nucleic acid molecule : adapter 2. 

Thus, in a preferred feature, a first nucleic acid
molecule fragment is generated which has a single
stranded nucleotide region at either terminus (SS1a and
SS1b), each of which is bound by an adapter, which may
be the same or different, and the first of said adapters
is bound to said second nucleic acid molecule and the
second of said adapters binds either to said second
nucleic acid molecule or to a third nucleic acid
molecule.

Thus, alternatively stated, in a preferred 
embodiment, the present invention provides a method of 
cloning a fragment of a first nucleic acid molecule into
a second nucleic acid molecule, wherein said method 
comprises at least the steps: 

1) cleaving said first nucleic acid molecule with a 
nuclease which has a cleavage site separate from its 
recognition site to create one or more fragments of said
first nucleic acid molecule, wherein at least one
fragment has a single stranded nucleotide region at both
termini (SS1a and SS1b),

2) cleaving said second nucleic acid molecule to 
create at least two single stranded regions (SS2a and 
SS2b) at the site of said cleavage (e.g. linearizing a 
circular vector or producing fragments in a linear or 
circular vector),

3) binding to one of the single stranded regions of
step 1) (SS1a)

a first adapter molecule comprising at one terminus
a single stranded region (SSA1) complementary to
the single stranded region of said first nucleic
acid molecule fragment (SS1a) and additionally
comprising at the other terminus a further single
stranded region (SSA2) complementary to one of the
single stranded regions (SS2a) produced by cleavage
of said second nucleic acid molecule, and
binding to a second single stranded region of step 1)
(SS1b)

a second adapter molecule as defined above which
binds to the second single stranded region of said
first nucleic acid molecule fragment (SS1b) and to
the second single stranded region (SS2b) produced
by cleavage of said second nucleic acid molecule,

4) ligating said adapters to said first nucleic acid 
fragment,

5) binding said adapters to said second nucleic acid
molecule or fragments thereof, and 

6) ligating said adapters to said second nucleic acid 
molecule or fragments thereof.

In instances in which cleavage of the second 
nucleic acid molecule results in the production of two 
or more discrete fragments which become ligated to the 
first nucleic acid molecule fragment via the adapters, 
said fragments constitute second and third nucleic acid 
molecules of the invention. 

Preferably, to prevent concatemerisation of
[adapter:first nucleic acid fragment:adapter] units, the



WO 01/00816 



PCT/GB00/02512 




single stranded region of the second and third nucleic 
acid molecules which bind to these adapters are not 
complementary. Thus, for example, where cloning into a 
vector is performed, preferably said vector is 
linearized and at least a portion of said vector is
removed from one terminus of that vector, e.g. at least
two cleavage events occur. 

In such methods, particularly for cloning, the 
second nucleic acid molecule, e.g. into which a first 
nucleic acid molecule fragment is inserted is
conveniently a vector (or a part thereof, e.g. where the
second and third nucleic acid molecules together
comprise the vector, and result through its cleavage).
Such vectors include any double stranded nucleic acid
molecule which may be linear or circular. (However, as
mentioned above in respect of the adapters, providing 
single stranded regions exist, or are generated at the 
termini of the second nucleic acid or its fragments 
(e.g. the vector), the adjacent regions may be made up
of any molecule providing ligation at the termini to the
adapters is not compromised.) 

Conveniently such vectors may contain sequences 
which aid their use in methods of the invention or their 
subsequent manipulation. Thus, vectors are conveniently 
selected with only two or a small number of restriction
cleavage sites for the method of cleavage used. Thus 
for example where restriction enzymes are used, the 
vector is selected to include only a minimal number, 
preferably only two recognition sites to that enzyme.

Vectors may additionally comprise further portions
or sequences for cloning, selection, amplification,
transcription or translation as appropriate. Thus 
vectors may be used with probe or primer sites, promoter 
regions, other regulatory regions, e.g. expression 
control sequences etc. Conveniently well-known cloning
vectors are employed, such as pBR322 and derived
vectors, pUC vectors such as pUC19, lambda vectors, BAC,
YAC and MAC vectors and other appropriate plasmids or
viral vectors.

The molecule of which a fragment is to be inserted, 
ie. the first nucleic acid molecule, may be any molecule 
5 which can generate single stranded regions at at least 
one of its ends using the nucleases described herein, 
although the central portion may be varied as 
appropriate. Preferably however such molecules are 
double stranded nucleic acid molecules and contain
appropriate sites for the use of enzymes to create the
single stranded overhangs which are required in 
accordance with the invention. Appropriately, the first 
nucleic acid molecule is derived from genomic DNA and 
the method of the invention is used to insert fragments
thereof into appropriate vectors.

Adapters which may be used include short double 
stranded nucleic acid molecules with single stranded 
regions at their termini to longer molecules which may 
contain further sequences for example to allow selection
as described hereinafter. Appropriate single stranded
regions are selected on the basis of the terminal 
sequence of the first, second and third nucleic acid 
molecules or fragments thereof. Appropriate selection 
may also be used to direct the orientation of the
insert, e.g. to produce clones which may be used to
produce antisense nucleic acid molecules. 

Adapters may be used in the methods of the 
invention in which their single stranded overhangs have 
already been generated, e.g. by the combination of
single stranded complementary oligonucleotides which on
hybridization leave overhangs at either end, or by
appropriate cleavage or digestion. 

Alternatively, during the method of the invention, 
adapters may be modified to provide single stranded
portions, e.g. by the use of restriction enzymes or
other appropriate techniques during the course of the
reaction. Conveniently, to simplify the number of
steps, the enzymes used to generate single stranded
regions in the first, second or third nucleic acid 
molecules (where necessary) may be used to generate the 
adapter single stranded regions. 

As mentioned previously, the single stranded region 
may be 4 or more bases in length. When using longer 
overhangs or where the sequence of the full 
corresponding single stranded region of the first, 
second or third nucleic acid molecules is not known or 
unclear, a family of adapters with one or more 
degenerate bases in the single stranded region may be 
used, for example using methods to create libraries of 
adapters. Degenerate bases may also be used at 
positions prone to mis-match ligations. 

For convenience a universal library of adapters may 
be created for use in the method of the invention. Thus 
for example, 16 different adapters with a 4 base-pair 
overhang consisting of two random bases (NN) and two 
bases specific to each adapter (e.g. AA, CC, ... TT) may
be created. In this way sufficient adapters may be 
created which are capable of distinguishing between 16 
different first molecule fragment overhangs, which would 
suffice for many cloning purposes. Similarly a library 
of second molecule, e.g. vector overhangs may be 
created.
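
A universal library of this kind may be sketched as follows. The sketch is illustrative only: the function names are hypothetical, the adapter overhang is assumed to be written 5'->3' as the two random bases (NN) followed by the two specific bases, and antiparallel pairing is assumed, so that the two specific bases pair with the first two bases of the fragment overhang.

```python
from itertools import product

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def build_adapter_library():
    """Map each specific dinucleotide to the overhang pattern 'NN' + dinucleotide."""
    return {"".join(p): "NN" + "".join(p) for p in product(BASES, repeat=2)}

def select_adapter(library, fragment_overhang):
    """Pick the pattern whose specific bases pair with the first two
    bases of a 4-base fragment overhang (an assumption of this sketch)."""
    return library[reverse_complement(fragment_overhang[:2])]

library = build_adapter_library()
print(len(library))                     # 16
print(select_adapter(library, "CCAT"))  # NNGG
```

Fragment overhangs that share the same first two bases select the same adapter, which is why such a library distinguishes between 16 overhang classes rather than all 256 possible 4-base overhangs.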

To increase the number of permutations in an 
adapter library, two separate oligonucleotide libraries 
may be generated, one with single stranded 
oligonucleotides with regions that will correspond to 
the single stranded region of the first nucleic acid 
molecule fragment and the second library with single 
stranded oligonucleotides with regions that will 
correspond to the single stranded region of the second 
nucleic acid molecule (e.g. vector). However in common
in each member of the library is a complementary region, 
such that when one member from the first library is 
selected and combined with a member of the second 
library, they will hybridize leaving free the relevant
single stranded regions. Thus for example to generate 
an adapter with an AA overhang and a TC overhang to bind 
to the first and second nucleic acid molecules 
respectively, members of the different libraries such as 
GGGCCCCNNAA may be combined with TCNNNCCGGGG to form: 
GGCCCCCNNAA, 
TCNNNCCGGGG 

which exhibits the appropriate overhangs. When using 
only two 16 member libraries this allows the production 
of 256 different adapters. 
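
The count of 256 follows from pairing every member of one 16 member library with every member of the other. A minimal sketch, in which each library member is represented only by its two specific bases and the function name is hypothetical:

```python
from itertools import product

BASES = "ACGT"
# The 16 specific dinucleotides AA, AC, ... TT of one library
DINUCLEOTIDES = ["".join(p) for p in product(BASES, repeat=2)]

def combine_libraries(first_library, second_library):
    """Pair every member of the first library with every member of the second."""
    return list(product(first_library, second_library))

adapters = combine_libraries(DINUCLEOTIDES, DINUCLEOTIDES)
print(len(adapters))             # 256
print(("AA", "TC") in adapters)  # True
```

Only 32 oligonucleotides (16 per library) therefore need to be synthesised to assemble any of the 256 adapters.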

In generating appropriate adapters conveniently the 
amount of mis-match which needs to be tolerated when
binding to overhangs on first, second and/or third 
nucleic acid molecules should be reduced. This may 
conveniently be achieved by selecting oligonucleotides 
on the basis of the probability of a mismatch ligation 
being generated. A computer program for achieving this 
is described in more detail in Example 6. This method 
allows sets of oligonucleotides to be identified which 
can be used to construct chains with more than 100 
fragments in a single ligation cycle but with very low 
levels of mis-match. Thus in a further feature the 
present invention provides computer software adapted to 
identify adapter molecules for use in the method of the 
invention. 
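
By way of illustration only, one simple way of selecting such a set is sketched below. It is not the program of Example 6: the function names are hypothetical, and the number of differing bases (Hamming distance) is used as a crude stand-in for the probability of a mis-match ligation, which is an assumption of this sketch. An overhang is accepted only if it differs sufficiently from its own reverse complement, from every overhang already chosen and from the reverse complement of every overhang already chosen.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def select_overhangs(length=4, min_distance=2):
    """Greedily pick overhangs that are unlikely to pair with the wrong partner."""
    chosen = []
    for candidate in ("".join(p) for p in product("ACGT", repeat=length)):
        # Skip overhangs that could pair with another copy of themselves.
        if hamming(candidate, reverse_complement(candidate)) < min_distance:
            continue
        if all(hamming(candidate, other) >= min_distance
               and hamming(candidate, reverse_complement(other)) >= min_distance
               for other in chosen):
            chosen.append(candidate)
    return chosen

overhangs = select_overhangs()
print(overhangs[0], len(overhangs))
```

Raising min_distance gives a smaller but more stringent set; a greedy pass of this kind does not guarantee the largest possible set.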

As mentioned above, the production of fragments of 
said first nucleic acid molecule is achieved using a 
nuclease which has a cleavage site separate from its 
recognition site. In so doing, unique overhangs are 
created which reflect the sequence of that molecule. In 
a preferred feature, said nuclease is a class IP or IIS 
restriction enzyme or functional derivatives thereof. 
Such enzymes include enzymes produced synthetically 
through the fusion of appropriate domains to arrive at 
enzymes which cleave at a site distal to their 
recognition site. 

These enzymes exhibit no specificity to the
sequence that is cut and they can therefore generate
overhangs with all types of base compositions. Cleavage
with IIS enzymes results in overhangs of various lengths,
e.g. from -5 to +6 bases in length. Preferably for
performing the method of the invention, enzymes are 
chosen which generate 3-6, e.g. 4 base pair overhangs. 
Preferred enzymes for use in the invention include 
enzymes which produce 4 base overhangs at the 3' end: 
BstXI; 5 base overhangs at the 3' end: AloI, BaeI, BplI, 
Bsp24I; 6 base overhangs at the 3' end: CjeI, CjePI, 
HaeIV; 4 base overhangs at the 5' end: AceIII, Acc36I, 
Alw26I, AlwXI, Bbr7I, BbsI, BbvI, BbvII, Bbv16II, 
Bli736I, BpiI, BpuAI, BsaI, Bsc91I, BseKI, BseXI, BsmAI, 
BsmBI, BsmFI, Bso31I, Bsp423I, BspBS31I, BspIS4I, 
BspLU11III, BspMI, BspST5I, BspTS514I, Bst12I, Bst71I, 
BstBS32I, BstGZ53I, BstTS5I, BstOZ616I, BstPZ418I, 
Eco31I, EcoA4I, EcoO44I, Esp3I, FokI, PhaI, SfaNI, 
Sth132I, StsI; and 5 base overhangs at the 5' end: HgaI. 

Over 100 classes of IIS restriction endonucleases 
have been identified and there are large variations both 
with respect to substrate specificity and cleaving 
pattern. In addition, these enzymes have proved to be 
well suited to "module swapping" experiments so that one 
can create new enzymes for particular requirements 
(Huang-B, et al.; J-Protein-Chem. 1996, 15(5):481-9, 
Bickle, T.A.; 1993 in Nucleases (2nd edn), Kim-YG et 
al.; PNAS 1994, 91:883-887). In these experiments the 
binding domain of transcription factor Sp1 was merged 
with the cleavage domain of FokI to construct a class 
IIS restriction endonuclease that makes a 4-base 
overhang with Sp1 sites. In other experiments a class 
IIS restriction endonuclease that cuts outside the 
binding sites of transcription factor Ultrabithorax was 
generated. Corresponding experiments have been 
conducted on class I enzymes. By merging the N-terminal 
part of the hsdS sub-unit of StyR 1241 (which recognizes 
GAAN6RTCG) with the C-terminal part of the hsdS sub-unit 
of StyR 1241 (which recognizes TCAN7RTTC) a new enzyme 
that recognizes the sequence GAAN6RTTC was constructed. 
Several other experiments have been carried out with 
similar success. Unlike in the case of ordinary class 
II enzymes, it is therefore reasonable to assume that a 
number of new IIS and IP restriction enzymes can be 
constructed and adapted to cloning requirements that may 
arise in the future. Very many combinations and 
variants of these enzymes can therefore be used 
according to the principles described herein. 

Generation of the single stranded regions on said 
first nucleic acid fragment may be achieved directly by 
cleavage of said first nucleic acid molecule with 
nucleases described herein without the development of 
intermediate molecules. This forms a preferred feature 
of the invention. Alternatively, indirect and more 
elaborate techniques may be used. For example, the 
first nucleic acid molecule or a fragment thereof may be 
"trimmed" using the nucleases described herein, in which 
linker molecules which carry the nuclease recognition 
site are bound to the first nucleic acid molecule or 
fragment thereof, and cleavage outside the recognition 
site results in cleavage within the first nucleic acid 
molecule or fragment thereof. This method is 
particularly useful since it takes advantage of the fact 
that T4 DNA ligase (and also other ligases) works well 
in most buffers used for restriction cutting. Ligation 
and cleavage can therefore be performed simultaneously 
in the same solution. Furthermore, this method allows 
the generation of a unique overhang when the overhang 
generated by the first cleavage step is not unique. 

The trimming procedure may be initiated using an 
"initiation linker" that is addressed to an overhang on 
the first nucleic acid molecule or fragment thereof, 
e.g. after cleavage with one or more restriction 
endonucleases as described herein. As used herein, a 
"linker" refers to a molecule which is similar to an 
"adapter" as described herein, except that the linker 
need only contain one single stranded region to allow 
binding to the molecule to be trimmed. Furthermore, the 
initiation linker contains one or more cleavage sites 
for nucleases that cleave outside their own recognition 
sequence, as described herein, for example BplI. The 
first nucleic acid molecule or fragment thereof should 
preferably not contain cleavage sites for the IIS 
enzyme(s) used for the trimming procedure. Such 
cleavage sites may alternatively be inactivated prior to 
the trimming procedure (e.g. by methylation). 

Propagation linkers (if used) and a termination 
linker (wherein the latter may be an adapter as 
described herein), T4 DNA ligase and the IIS enzyme(s) 
used for the trimming may be added together with the 
initiation linker. Once the initiation linker has been 
ligated into position, cleavage may be effected 
resulting in the generation of an overhang within the 
first nucleic acid molecule or fragment thereof. If 
desired (ie. if further trimming is required), a 
propagation linker containing degenerate overhangs may 
be used to ligate with the overhang which has been 
generated. Since the linker will also carry an 
appropriate nuclease recognition site, cleavage will 
again take place, this time further upstream 
into the first nucleic acid molecule or fragment 
thereof. This process will continue until an overhang 
is generated that is complementary to one of the 
overhangs in the termination linker (or adapter as 
described herein). This final linker will not itself 
have the nuclease recognition site and will therefore 
terminate trimming. As mentioned previously, this 
terminator linker may have an appropriate single 
stranded region for binding to the adapter used in the 
next step, or may itself be the adapter. An appropriate 
technique for performing the trimming method may be 
found in Examples 4 and 9. 

The trimming method is preferably not performed 
with IIS enzymes belonging to the BcgI class (e.g. BplI, 
BaeI etc.) as the proteins are combined methylases and 
endonucleases and the methylase function may inactivate 
the binding sites on propagation linkers. Enzymes 
including FokI, HgaI etc. are therefore preferred 
enzymes for performing this method. If BcgI class 
enzymes are to be used, the cofactor AdoMet should be 
replaced with AdoHcy, sinefungin or other cofactors 
that cannot function as methyl donors. 

Thus in a preferred feature the invention provides 
a method of removing the end terminus of a double 
stranded nucleic acid molecule with at least one single 
stranded region, comprising at least the steps of (i) 
binding (ie. ligating) a double stranded linker molecule 
containing a recognition site for a nuclease which 
cleaves outside its recognition site and a single 
stranded region complementary to the single stranded 
region on said double stranded nucleic acid molecule to 
said molecule and cleaving using said nuclease, thereby 
resulting in removal of one or more bases (e.g. 3-10, 
which may be in single or double stranded form, or a 
combination thereof) from the terminus of said nucleic 
acid molecule, (ii) optionally binding one or more 
propagation linkers which contain a recognition site for 
a nuclease as described above and a degenerate single 
stranded region which binds to the overhang generated by 
the first or subsequent cleavage steps and cleaving 
using said nuclease, and (iii) adding a termination 
linker which binds to the single stranded region 
generated in steps i or ii. 
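
The logic of steps (i) to (iii) can be followed in a toy model. The sketch below is an illustration only and makes several simplifying assumptions that are not taken from the description: the molecule is represented by one strand written 5' to 3' from the terminus being trimmed, each cleavage removes a fixed number of bases (`bases_per_cycle`), and the overhang exposed after each cleavage is taken to be the next `overhang_len` bases. All names and sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Sequence of the strand that pairs with seq, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def trim(molecule, terminator_overhang, bases_per_cycle=4,
         overhang_len=4, max_cycles=50):
    """Trim until the exposed overhang pairs with the termination linker.

    Returns the remaining sequence and the number of propagation cycles
    that followed the cleavage directed by the initiation linker.
    """
    target = reverse_complement(terminator_overhang)
    remaining = molecule
    for cycle in range(max_cycles + 1):
        exposed = remaining[:overhang_len]
        if len(exposed) < overhang_len:
            break  # molecule used up before a matching overhang appeared
        if exposed == target:
            # The termination linker binds here; it carries no nuclease
            # recognition site, so trimming stops.
            return remaining, cycle
        # A propagation linker with a degenerate overhang binds and the
        # nuclease cleaves further into the molecule.
        remaining = remaining[bases_per_cycle:]
    raise ValueError("no overhang complementary to the termination linker")


remaining, cycles = trim("ACGTTTGACCATGGA", terminator_overhang="ATGG")
print(remaining, cycles)  # CCATGGA 2
```

In this model the number of cycles is fixed by where the matching overhang lies, which mirrors the statement that the process continues until an overhang complementary to the termination linker is generated.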

A similar technique may be used to remove unwanted 
sequences, e.g. contributed by the adapter after 
ligation of the first nucleic acid molecule fragment and 
second (or third) nucleic acid molecules. Various 
techniques may be used to remove the unwanted sequences, 
e.g. if the sequence (e.g. a region from the adapter) 
contains a plant transposon sequence, this may be 
removed by adding necessary transposase enzymes to 
excise that sequence. Alternatively, the unwanted 
sequence may be removed by taking advantage of nucleases 
that cleave outside their recognition site. Thus, for 
example, adapters may be used which contain recognition 
sites for such enzymes which on cleavage (by appropriate 
selection of cleavage site sequences), result in 
overhangs generated at two distinct cleavage sites which 
are complementary and thus allow concomitant excision of 
the intervening sequence. Examples of techniques for 
removing intervening sequences are shown in Example 
5. It will be appreciated that depending on the 
nuclease employed, it may be necessary to inactivate 
sites for that enzyme at locations other than adjacent 
to or within the intervening sequence. 

Thus, in a further preferred feature, adapters as 
used herein, additionally comprise one or more nuclease 
recognition and cleavage sites whereby arrangement of 
said sequences allows, on cleavage, generation of 
complementary single stranded regions wherein each one 
of said pair of single stranded regions is generated by 
cleavage at a distinct site. 

Depending on how the different steps in the method 
of the invention are performed, as described 
hereinafter, where necessary the second nucleic acid 
molecule, and/or the adapters may also be cleaved or 
digested to provide appropriate single stranded regions. 
In a preferred feature, the second nucleic acid molecule 
and/or the adapters are cleaved using the nucleases 
described above for generating the first nucleic acid 
molecule fragments. However, instead of cleavage with 
such nucleases, to generate appropriate single stranded 
regions and/or fragments from the second or third 
nucleic acid molecules or adapters, alternative 
techniques may be used. Thus for example other 
restriction enzymes, non-specific nucleases or 
appropriate exonucleases or mechanical methods such as 
sonication or vortexing may be used. Where enzymes are 
employed, small volumes are preferably used during the 
reactions to increase efficiency. 

Ligation between the adapters and first, second and 
third nucleic acid molecules is achieved by any 
appropriate technique known in the art (see for example, 
Sambrook et al., in "Molecular Cloning: A Laboratory 
Manual", 2nd Ed., Editor Chris Nolan, Cold Spring Harbor 
Laboratory Press; 1989). For example, ligation may be 
achieved chemically or by use of appropriate naturally 
occurring ligases or variants thereof. Appropriate 
ligases which may be used include T4 DNA ligase, and 
thermostable ligases, such as Pfu, Taq, and Tth DNA 
ligase. Ligation may be prevented or allowed by 
controlling the phosphorylation state of the terminal 
bases e.g. by appropriate use of kinases or 
phosphatases. Appropriately large volumes may also be 
used to avoid intermolecular ligations. Thus, high 
adapter to vector/insert ratios may be used to avoid the 
vector or insert religating into its source material. 

Other techniques may be used to avoid or remove 
vectors which become religated or which do not cleave. 
For example the insert may be cloned into a selection 
marker that destroys the host bacteria unless it has 
been inactivated by the insert. Alternatively 
restriction cleaving using restriction enzymes specific 
for the fragment removed from the vector may be 
performed after the ligation step. Religated and 
uncleaved vectors would be cleaved in this step. The 
ideal cloning site is therefore one which contains 
many unique restriction sites that are removed upon 
insert ligation. Alternatively well-known techniques 
may be used for identifying the desired product, e.g. 
gel separation. 

If the steps of cleavage and ligation are performed 
together, advantageously the insert and the vector into 
which it is inserted do not contain binding sites for 
the nuclease used. Similarly, it is advantageous if the 
fragment removed from the vector during the process of 
cloning contains binding sites for the nuclease. In 
that case, if that fragment religates with the vector it 
would be cleaved and thereby removed again. 
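
These conditions lend themselves to a simple check before the reaction is set up. The following sketch is illustrative only: the function names and the three example sequences are hypothetical, and FokI (recognition sequence GGATG) is used merely as an example of a nuclease that cleaves outside its recognition site.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Sequence of the strand that pairs with seq, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def has_site(sequence, site):
    """True if the recognition site occurs on either strand."""
    sequence = sequence.upper()
    return site in sequence or reverse_complement(site) in sequence


def suits_combined_reaction(insert, vector_backbone, removed_fragment, site):
    """Check the conditions for performing cleavage and ligation together:
    the insert and vector lack the site, the removed fragment carries it."""
    return (not has_site(insert, site)
            and not has_site(vector_backbone, site)
            and has_site(removed_fragment, site))


FOKI_SITE = "GGATG"
insert = "ATGCCGTTAGCA"              # hypothetical insert
backbone = "TTGACAGCTAGCTCAGTCC"     # hypothetical vector backbone
removed = "AACATCCTT"                # carries CATCC, ie. GGATG on the other strand

print(suits_combined_reaction(insert, backbone, removed, FOKI_SITE))  # True
```

Both strands are searched because a recognition site on the complementary strand directs cleavage just as well.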

Once the first and second nucleic acid molecules 
(and optionally third nucleic acid molecules) or 
fragments thereof have been covalently attached, where 
necessary selection of appropriate products from any 
side-products may be performed. Selection may be 
performed by any techniques known in the art. 
Conveniently however, labelled probes may be used to 
identify sequences present only in the correct product, 
e.g. by probing for one or more sequences formed only 
through the union of the correct sequences, e.g. a probe 
directed to the junction between the adapter and the 
first, second or third nucleic acid sequences. 
Alternatively, the correct ligation may be detected by 
functional properties bestowed on the product through 
ligation, e.g. through the completion of sequences which 
allow expression of a particular product once the vector 
has been cloned into an appropriate host. 

Alternatively, selection may be performed by sequencing 
of the products which have been obtained, e.g. after 
amplification and/or transformation. 

Appropriate labels include any moieties which 
directly or indirectly allow detection and/or 
determination through the generation of a signal. 
Many appropriate examples exist, including for 
example radiolabels, chemical labels (e.g. 
EtBr, TOTO, YOYO and other dyes), chromophores or 
fluorophores (e.g. dyes such as fluorescein and 
rhodamine), or reagents of high electron density such as 
ferritin, haemocyanin or colloidal gold. Alternatively, 
the label may be an enzyme, for example peroxidase or 
alkaline phosphatase, wherein the presence of the enzyme 
is visualized by its interaction with a suitable entity, 
for example a substrate. 

As mentioned previously, one of the significant 
advantages which this method offers over known methods 
is the simplification of the techniques which are 
required. The steps described herein may be performed 
sequentially in separate tubes (e.g. when different 
enzymes are used and cross-reaction is undesirable) or 
in a limited number of steps. However, ideally, the 
reaction is performed in a single step. This can be 
achieved by appropriate selection of enzymes, adapters 
and second/third nucleic acid molecules, e.g. vectors. 
Thus for example the first nucleic acid molecule 
may be fragmented using a particular nuclease which is 
also used to fragment the second nucleic acid molecule. 
Since the enzyme used will cleave outside its 
recognition site, it would be expected that the 
resulting single stranded regions found on both the 
first and second nucleic acid molecule fragments will be 
unrelated. However, by appropriate choice of the 
mediating adapters (which may also be added providing 
they do not have restriction sites for that enzyme, or 
that cleavage at those sites reveals appropriate single 
stranded regions), these unrelated sequences may be 
linked via the intermediacy of the adapters. Thus the 
entire reaction may be performed in a single step. 

It will also be appreciated that the adapters may 
be used to address the first nucleic acid fragments to 
different second nucleic acid fragments or cleavage 
sites. This would therefore allow different first 
nucleic acid molecule fragments to be directed and 
ligated to a particular vector or site within a vector. 
Thus multiple vectors (and corresponding appropriate 
adapters) may be used simultaneously and take up a 
single first nucleic acid molecule fragment. 

Alternatively, multiple fragments or copies of the 
same fragment could be inserted at different sites 
within the same vector (in the latter case by the use of 
adapters with one common end but with the other end 
exhibiting variability to allow it to bind to different 
sites within the vector). In a further alternative, the 
first nucleic acid molecule fragments could be captured 
in the reverse orientation (again by appropriate adapter 
choice) and inserted into a vector, e.g. to produce 
antisense strands. 

Thus in a preferred embodiment the method described 
herein is performed in a single step. The ligation 
steps (ie. adapter to first nucleic acid molecule 
fragment and final ligation) may however be conducted 
separately once association of the relevant molecules 
has been achieved. In a further preferred embodiment, 
the invention provides a method of simultaneously 
attaching two or more fragments of the first nucleic 
acid molecule to different second nucleic acid molecules 
(or different termini thereof). In cloning, this 
equates to the introduction of the two or more fragments 
into different sites in said second nucleic acid 
molecules or into different second nucleic acid 
molecules, e.g. into different sites within a vector or 
into different vectors. 

Thus the present invention provides methods of the 
invention in which two or more fragments of the first 
nucleic acid molecule are attached to different second 
and optionally third nucleic acid molecules, or 
different termini thereof. In a preferred feature, 
methods are provided wherein one or more fragments of 
said first nucleic acid molecule are attached via 
adapters to single stranded regions in said second 
nucleic acid molecule resulting from different cleavage 
events. As a further preferred feature, methods are 
provided wherein one or more fragments of said first 
nucleic acid molecule are attached via adapters to 
single stranded regions in two or more second nucleic 
acid molecules. 

It will be appreciated that even more complex 
reactions may be envisaged in which multiple first 
nucleic acid molecules (e.g. 2 or more, e.g. 2-10) are 
simultaneously cleaved in the same reaction and their 
fragments bound to appropriate adapters which direct 
them to bind to different second nucleic acid molecules, 
e.g. different vectors or sites in vectors. 

Whilst the above described methods describe an 
especially simplified method, the above described 
effects may also be achieved by performing the method in 
discrete steps. This is particularly appropriate where 
different enzymes are used which would produce 
undesirable products in other molecules. Thus for 
example, different nucleases, such as restriction enzymes, 
may be used to cleave the first and second nucleic acid 
molecules. In such cases, the molecules are cleaved 
separately, whereafter the enzymes are removed or 
inactivated before the fragments are mixed together with 
the adapters. Similarly, even if the same enzyme is 
used, if the adapters contain enzyme sensitive sites, 
the adapters could be appropriately modified to avoid 
reaction, e.g. by methylation, or the enzymes used to 
fragment the first and/or second nucleic acid molecules 
would be inactivated or removed (as mentioned above) 
prior to the addition of the adapters. 

Conveniently, inactivation of enzymes may be 
achieved by incubation at a temperature of at least 
65°C, e.g. for 20 minutes. Alternatively, appropriate 
techniques employing removal of the enzymes from the 
reaction, use of chelators, inhibitors etc. may be used 
to achieve inactivation. 

Once appropriate clones have been generated and 
selected these may be treated according to standard 
methods of amplification, transformation, replication, 
expression, sequencing, depending on the proposed 
application of the clones. Other aspects of the 
invention thus include the nucleic acid molecule product 
of the method (ie. the nucleic acid molecule that is the 
[first nucleic acid molecule fragment]:[adapter]:[second 
nucleic acid molecule] product), such as cloning and 
expression vectors comprising that nucleic acid molecule 
product as well as transformed or transfected 
prokaryotic or eukaryotic host cells, or transgenic 
organisms containing a nucleic acid molecule produced 
according to the method of the invention. 

Appropriate expression vectors include appropriate 
control sequences such as for example translational 
(e.g. start and stop codons, ribosomal binding sites) 
and transcriptional control elements (e.g. promoter- 
operator regions, termination stop sequences) linked in 
matching reading frame with the nucleic acid molecules 
of the invention. Appropriate expression systems are 
well known and documented in the art as well as methods 
for their introduction and expression in prokaryotic or 
eukaryotic cells or germ line or somatic cells to form 
transgenic animals. Appropriate expression vectors for 
transformation include bacteriophages and viruses, such 
as baculovirus, adenovirus and vaccinia viruses. 

Kits for performing the methods described herein 
form a preferred aspect of the invention. Thus viewed 
from a further aspect the present invention provides a 
kit for attaching a first nucleic acid molecule fragment 
to a second nucleic acid molecule or a fragment thereof 
comprising at least (i) one or more adapters as 
described hereinbefore or means for producing such 
adapters, (ii) the second nucleic acid molecule and 
(iii) a nuclease which cleaves outside its recognition 
site, wherein the terminus of one of said adapters has a 
single stranded region complementary to a single 
stranded region generated on said second nucleic acid 
molecule after cleavage with said nuclease. 

Preferably said kit comprises a library of 
oligonucleotides, e.g. as described herein, particularly 
as described in Example 3, from which appropriate 
adapters may be generated. The library of 
oligonucleotides as described herein forms a further 
preferred feature of the invention. Thus for example 
said library may comprise a plurality of 
oligonucleotides comprising 1) a plurality of 
oligonucleotides of the formula XNNNNN wherein X is one 
or more bases (wherein said bases are as described 
hereinbefore) and is invariant in all of said 
oligonucleotides and each N is a base at the 5' end 
which is varied in the different oligonucleotides, ie. 
to produce 1024 variants, 2) a plurality of 
oligonucleotides of the formula X'NNNN wherein X' is 
complementary to X and is invariant in all of said 
oligonucleotides and each N is a base at the 5' end as 
described hereinbefore, 3) a plurality of 
oligonucleotides of the formula YNNNNN wherein Y, which 
is not the same as X, is one or more bases (wherein said 
bases are as described hereinbefore) and is invariant in 
all of said oligonucleotides and each N is a base at the 
3' end as described hereinbefore, and 4) a plurality of 
oligonucleotides of the formula Y'NNNNNN wherein Y' is 
complementary to Y and is invariant in all of said 
oligonucleotides and each N is a base at the 3' end as 
described hereinbefore. 
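
The figure of 1024 variants for the formula XNNNNN follows from five variable positions with four possible bases each (4^5 = 1024). A minimal sketch of such a library, and of looking up the member that pairs with a given overhang, is shown below. The invariant region "GGATCC", the order in which the invariant and variable parts are written, and the function names are assumptions made for illustration only.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Sequence of the strand that pairs with seq, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def make_library(invariant, n_variable=5):
    """Map every possible variable region to its full oligonucleotide."""
    return {
        "".join(v): invariant + "".join(v)
        for v in product("ACGT", repeat=n_variable)
    }


def pick_for_overhang(library, target_overhang):
    """Return the library member whose variable region pairs with the target."""
    return library[reverse_complement(target_overhang)]


library = make_library("GGATCC")  # hypothetical invariant region X
print(len(library))               # 1024
print(pick_for_overhang(library, "AACGT"))  # GGATCCACGTT
```

Because every possible five-base region is present exactly once, any five-base overhang can be addressed by a single look-up rather than by synthesizing a new oligonucleotide.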

Optionally the kit may contain other appropriate 
components selected from the list including ligases, 
enzymes necessary for inactivation and activation of 
restriction or ligation sites, primers for amplification 
and/or appropriate enzymes, buffers and solutions, and a 
data carrier containing a computer program to assist in 
the selection of oligonucleotides from the above 
mentioned library. The use of such kits for performing 
the method of the invention forms a further aspect of the 
invention. 

The above described method may be adapted to 
combine multiple first, second, third etc. nucleic acid 
molecules as described below. In this method multiple 
fragments are combined by appropriate selection of the 
single stranded regions which appear at their ends. 
This has application in the production of specific 
sequences for biological purposes, but has particular 
utility in the production of nucleic acid molecule 
chains in which the units making up the chains each 
denotes a unit of information, ie. the chains may be 
used to store information, as will be described in more 
detail below. As used herein "chain" refers to a serial 
arrangement of fragments as described herein. Such 
chains are preferably linear and include branched and 
unbranched fragment sequences. Thus, for example, 
branched DNA fragments may be used to provide chains 
with a branched arrangement of fragments. 

To produce nucleic acid molecule chains with 
different unit fragments, ie. fragment chains, the 
following method may be used. Firstly it is necessary 
to generate fragments which have overhangs at either 
end, to allow them to bind to one another. (The 
ultimate 3' and 5' fragments may however have an 
overhang at only the end which will become attached to 
internal fragments.) As will be described in more 
detail below, for certain applications appropriate 
oligonucleotides may be derived from libraries in which 
the members exhibit variability in at least some of 
their bases. If libraries are to be produced in which 
the members are double stranded, it will be appreciated 
that the number of members in such a library could be 
rather high. This can however effectively be reduced by 
using a smaller number of smaller building blocks. 

One strategy is to make two single-stranded 
oligonucleotides using conventional techniques. In the 
example described above (6 base double stranded linker 
and 3 base overhangs at either end), oligonucleotides 
having a region of 6 bases which complement each other 
and so allow hybridization may be used. Since not all 
of each molecule is involved in the hybridization, 
the remaining bases extend beyond the hybridizing 
region thus creating single stranded regions. 
Conveniently the number of required library members may 
be reduced even further if repeat sequences appear with 
frequency in the fragment chain. This will be described 
in more detail below. 

Once the appropriate double stranded chain units 
(ie. fragments) have been created they may be ligated 
together in the same solution, providing the different 
overhangs present on the sequences are unique. 

Thus in a further aspect, the present invention 
provides a method of synthesizing a double stranded 
nucleic acid molecule comprising at least the steps of: 

1) generating n double stranded nucleic acid 
fragments, wherein at least n-2 fragments have single 
stranded regions at both termini and 2 fragments have 
single stranded regions at at least one terminus, 
wherein (n-1) single stranded regions are complementary 
to (n-1) other single stranded regions, thereby 
producing (n-1) complementary pairs, 

2) contacting said n double stranded nucleic acid 
fragments, simultaneously or consecutively, to effect 
binding of said complementary pairs of single stranded 
regions, and 

3) optionally ligating said complementary pairs 
simultaneously or consecutively to produce a nucleic 
acid molecule consisting of n fragments. 

The terms "nucleic acid molecule", "single stranded 
regions", "complementary", "binding" and "ligating" are 
as described hereinbefore. 
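
The requirement in step 1) that the n fragments carry (n-1) unique complementary pairs of single stranded regions is what fixes the order of the fragments in the product. The sketch below illustrates this under simplifying assumptions that are not part of the method as described: each fragment is reduced to a name with a left and a right overhang written 5' to 3', the two terminal fragments carry an overhang at one end only, and the fragment names and sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Sequence of the strand that pairs with seq, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def assemble(fragments):
    """Return the fragment names in the order fixed by their overhangs.

    fragments: iterable of (name, left_overhang, right_overhang), where
    None marks a terminus without a single stranded region.
    """
    by_left = {}
    start = None
    for fragment in fragments:
        name, left, right = fragment
        if left is None:
            if start is not None:
                raise ValueError("more than one starting fragment")
            start = fragment
        elif left in by_left:
            raise ValueError("overhang %s is not unique" % left)
        else:
            by_left[left] = fragment
    if start is None:
        raise ValueError("no starting fragment")

    order = [start[0]]
    current = start
    while current[2] is not None:
        partner = reverse_complement(current[2])
        if partner not in by_left:
            raise ValueError("no partner for overhang %s" % current[2])
        current = by_left.pop(partner)
        order.append(current[0])
    if by_left:
        raise ValueError("some fragments were not incorporated")
    return order


# n = 4 fragments supplied in arbitrary order, forming n-1 = 3 pairs.
fragments = [
    ("F3", "TAGG", "GTCA"),
    ("F1", None, "AAGC"),
    ("F4", "TGAC", None),
    ("F2", "GCTT", "CCTA"),
]
print(assemble(fragments))  # ['F1', 'F2', 'F3', 'F4']
```

The order of addition to the mixture does not matter in this model, which reflects the statement that the fragments may be contacted simultaneously provided the overhangs are unique.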

In step 1) reference is made to (n-1) single 
stranded regions complementary to (n-1) "other" single 
stranded regions. This describes two families of single 
stranded regions, which together comprise 2(n-1) 
members, forming n-1 pairs. Thus "other" refers to 
single stranded regions in the second family which are 
not present in the first family. 

"Contacting" as used herein refers to bringing 
together the double stranded fragments under conditions 
which are conducive to association of the complementary 
single stranded regions. Depending on the method used, 
this may ultimately allow ligation of the fragments 
carrying those regions. It should however be noted that 
the fragments may be linked by methods other than 
ligation. For example PCR may be used with appropriate 
primers, e.g. pairs of primers. 

Simultaneous or consecutive contacting and/or 
ligation refers to the possibility of adding the 
fragments individually or in groups to a growing chain 
or simultaneously adding all n fragments together, 
wherein ligation may be performed after each addition or 
once all n fragments have been combined. Preferably 
ligation is effected once all fragments have been 
combined. 

"Fragments" as used herein are as defined herein 
before, but preferably are shorter in length. Thus 
fragments are preferably greater than 6 bases in length 
(wherein said length refers to the length of each single 
stranded oligonucleotide making up the fragment which 
may differ slightly in length from one another), e.g. 
between 6 and 50 bases, e.g. from 8 to 25 bases. 

As referred to herein, "n" is an integer of at 
least 4, for example at least 10 or 100, e.g. between 25 
and 200. 

Preferably, as mentioned above, the fragments are 
generated by the use of single stranded oligonucleotides 
to generate appropriate double stranded molecules. 

Of particular interest in such methods is the 
production of fragment chains that may be used to store 
information in the form of code which may readily be 
accessed. 

There is currently a great need for storing 
information for different purposes (e.g. computer 
software, music, films, databases etc.). It has 
therefore been imperative to find efficient storage 
media, resulting in the development of CD ROMs, DVD 
technology etc. Nucleic acid molecules offer far more 
efficient methods for storing information and have 
several advantages over storage methods currently in 
use. For example, the storage capacity of nucleic acid 
molecules is vast. In principle, a test-tube containing 
DNA molecules may contain as much information as several 
million CD ROMs or more. Nucleic acid may be copied 
quickly and efficiently using natural systems which are 
greatly enhanced by techniques which have been developed 
such as PCR, LCR etc. When stored appropriately, 
nucleic acid molecules may be preserved for extremely 
lengthy periods. Naturally existing tools for 
manipulation of nucleic acid molecules are already available 
for processing of the molecules, e.g. polymerases, 
restriction enzymes, transcription factors, ribosomes 
etc. The nucleic acid molecules may also have catalytic 
properties. 

Furthermore, nucleic acid molecules may be used as 
secure systems since they may be made such that they are 
not readily copied, unlike copying of current storage 
systems, e.g. CDs etc., which is increasingly prevalent. 

Previously however, it was not possible to take 
advantage of the enormous potential offered by nucleic 
acid molecules due to the absence of any effective 
methods for writing DNA messages or reading DNA 
messages. The above described methods overcome this 
problem, allowing the rapid synthesis 
of large DNA molecules and methods of rapidly and 
efficiently scanning those molecules to retrieve the 
information. 

The key to effective retrieval of information
encoded by the nucleic acid molecules produced according
to the method described herein is the expansion of the
information-providing unit in the molecule. In nature
and in methods used previously, each base in the
sequence has an individual informational content.
Indeed, methods have been described in which a single
base may signify more than a single informational unit,
e.g. in binary code, the bases A="00", C="01", G="10" and
T="11". Whilst this has advantages insofar as
significant amounts of information can be contained in a
single molecule, the system has serious drawbacks as it
requires writing and reading methods in which individual
bases may be attached and discriminated.
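
The single-base binary code mentioned above can be sketched as follows. This is a minimal illustration only; the function names are hypothetical and the mapping is the one quoted in the text (A="00", C="01", G="10", T="11").

```python
# Two-bits-per-base code quoted in the text: A="00", C="01", G="10", T="11".
BASE_TO_BITS = {"A": "00", "C": "01", "G": "10", "T": "11"}
BITS_TO_BASE = {bits: base for base, bits in BASE_TO_BITS.items()}


def bits_to_bases(bits: str) -> str:
    """Encode a bit string of even length as one base per two bits."""
    if len(bits) % 2:
        raise ValueError("bit string must have an even length")
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def bases_to_bits(bases: str) -> str:
    """Decode a base sequence back into the bit string it represents."""
    return "".join(BASE_TO_BITS[base] for base in bases)


print(bits_to_bases("01001011"))  # CAGT
print(bases_to_bits("CAGT"))      # 01001011
```

Reading such a message requires every individual base to be discriminated, which is the drawback the text goes on to address.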

In a preferred method of the invention therefore, 
information units are provided which are not single 
bases, but are instead short sequences. The techniques 
described above allow the rapid production of such
chains and the information may be readily accessed.

Thus units representing coded information may be 
generated and read. Each information unit may therefore 
represent an element of code, in which the code may for 
example be alphanumeric code or a simpler representation
such as binary code. In each case it is necessary for
individual elements of the code, e.g. "a", "b", "c",
"1", "0" etc., to be represented by an individualized and
specific sequence.

As used herein "information units" refer to
discrete short sequences which represent a single piece
of information, e.g. one or more (ie. combinations
thereof) elements of a code.

"Elements" of code, as mentioned above, refer to 
the different members making up a code such as binary or
alphanumeric code.

Thus, in a preferred embodiment of the method of 
the invention, the fragments which are linked together 
comprise regions representing a unit of information 
corresponding to one or more code elements. Preferably
the code is alphanumeric. Especially preferably the
code is binary. Thus for example, considering a binary
system of information capture, if one wishes to produce
chains consisting of "0", "1" fragments, appropriate
sequence combinations may be attributed to "0" or "1".

Conveniently each of said one or more code elements
(together) has the formula

(X)_a,
wherein

X is a nucleotide A, T, G, C or a derivative
thereof which allows complementary binding and may be
the same or different at each position, and

a is an integer greater than 2, e.g. greater than
4, for example from 2 to 20, preferably from 4 to 10,
e.g. 6 to 8,

wherein (X)_a is different for each one or more code
elements.

Especially preferably, in the case of binary code,
the code elements "1" and "0" may have the formulae:

"0"=(X)_a and "1"=(Y)_b,
wherein

(X)_a and (Y)_b are not identical,

X and Y are each a nucleotide A, T, G, C or a
derivative thereof which allows complementary binding
and may be the same or different at each position, and

a and b are integers greater than 2, e.g. greater
than 4, for example from 2 to 20, preferably from 4 to
10, e.g. 6 to 8.

As referred to herein, a "derivative" which is 
capable of complementary binding refers to a nucleotide 
analog or variant which is capable of binding to a 
nucleotide present in a complementary strand, and 
includes in particular naturally occurring or synthetic 
variants of nucleotides, e.g. uracil or methylated,
amidated nucleotides etc.

In its simplest and preferred form, X and Y are the
same at each position, e.g. "0"=GGGGGGGG and
"1"=AAAAAAAA. However, repeat sequences such as [AC]_6A
or [GT]_6A may be used. The code sequence may also have a
functional property, e.g. it may be an integration
element such as AttP1 or AttP2.

It will however be appreciated that the sequences 
described above may also denote more than a single code 
element. Thus for example the information unit may 
denote 2 or more code elements, e.g. from 2 to 32
elements, preferably from 2 to 4 code elements. If for
example binary code is considered, each information unit 
may refer to "01" or "00" or "11" or "10". 
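
By way of illustration, the use of short sequences as information units may be sketched as below. The sequences for "0" and "1" are those given above; the four sequences for two-element units are hypothetical, as are the function names, and the sketch assumes the units are simply concatenated, ignoring overhangs and any linking sequence between fragments.

```python
# Sequences for single code elements, as given in the text.
SINGLE_UNITS = {"0": "GGGGGGGG", "1": "AAAAAAAA"}

# HYPOTHETICAL sequences for information units denoting two code elements.
PAIR_UNITS = {
    "00": "GGGGGGGG",
    "01": "AAAAAAAA",
    "10": "CCCCCCCC",
    "11": "TTTTTTTT",
}


def encode(code: str, units: dict) -> str:
    """Replace each group of code elements by its information unit."""
    step = len(next(iter(units)))  # code elements carried by one unit
    if len(code) % step:
        raise ValueError("code length must be a multiple of the unit size")
    return "".join(units[code[i:i + step]] for i in range(0, len(code), step))


def decode(sequence: str, units: dict) -> str:
    """Read the sequence one information unit at a time."""
    unit_len = len(next(iter(units.values())))
    lookup = {seq: elements for elements, seq in units.items()}
    return "".join(
        lookup[sequence[i:i + unit_len]]
        for i in range(0, len(sequence), unit_len)
    )


print(encode("01", SINGLE_UNITS))                          # GGGGGGGGAAAAAAAA
print(decode(encode("0110", PAIR_UNITS), PAIR_UNITS))      # 0110
```

With two code elements per unit, the same message needs half as many units.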

In the method described herein, chains comprising
such features may be prepared as follows. To produce a
chain with for example 8 "0"/"1" fragments, eight "0"
starting fragments with different overhangs and eight "1"
starting fragments with different overhangs are
generated as illustrated in Figure 2. In this case "0"
fragments consist of the sequence GGGGGGGG, although 
this could be replaced by other sequences. In addition 
the fragments are synthesized such that they have unique 
overhangs such that they may only be ligated at one 
position. Thus, the fragments for position 1 in the 
chain are produced such that they have an overhang which 
is complemented by one of the overhangs in the fragments 
for position 2. Thus, the position 2 fragments are 
synthesized such that they can bind to position 1 
fragments. Similarly position 3 fragments may only bind 
to position 2 fragments at one of their termini and 
position 4 fragments at the other terminus and so forth. 
These fragments are stored separately. In order to build 
up a chain, a selection is made from the two
alternatives for each position such that an appropriate
binary chain is produced. 

Thus, in the scheme outlined above, to produce a 
fragment chain which represents a chain 01001011, "0" 
fragments from positions 1, 3, 4 and 6 are mixed with 
"1" fragments from positions 2, 5, 7 and 8. If the 
fragments are then ligated together by adding ligase or 
using other ligation methods mentioned previously, the
above described chain will be produced. As will be
appreciated, this chain could also be achieved using for 
example only 4 fragments if the information unit carried 
on each fragment denoted 2 code elements. 
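
The scheme above may be illustrated by a small simulation. This is a sketch under stated assumptions: overhangs are represented by integer labels rather than sequences, the label 0 stands for a blunt end as in Figure 2, and the names Fragment, select_fragments and ligate are hypothetical.

```python
import random
from typing import NamedTuple


class Fragment(NamedTuple):
    element: str  # code element carried by the fragment, "0" or "1"
    left: int     # label of the overhang at the left terminus (0 = blunt end)
    right: int    # label of the overhang at the right terminus (0 = blunt end)


def select_fragments(code: str) -> list[Fragment]:
    """Pick, for each position, the "0" or "1" fragment made for that position."""
    n = len(code)
    return [
        Fragment(element, left=i, right=(i + 1 if i + 1 < n else 0))
        for i, element in enumerate(code)
    ]


def ligate(pool: list[Fragment]) -> str:
    """Assemble a mixed pool by pairing each right overhang with its left partner."""
    by_left = {fragment.left: fragment for fragment in pool}
    current = by_left[0]  # the fragment with the blunt left end starts the chain
    chain = [current.element]
    while current.right != 0:
        current = by_left[current.right]
        chain.append(current.element)
    return "".join(chain)


pool = select_fragments("01001011")
random.shuffle(pool)  # the order of mixing does not matter
print(ligate(pool))   # 01001011
```

Because each overhang has a single partner, the shuffled pool can only assemble in the intended order.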

It is furthermore possible to combine intermediate
fragment chains (e.g. containing at least 4 fragments)
with other fragment chains which, provided appropriate
overhangs exist at their termini, may be ligated together
to form composite fragment chains. Thus, several cycles
could be conducted in parallel and the products
combined. In the method shown in Figure 2, the end
fragments have blunt ends, but clearly, appropriate
fragments could be used that similarly have overhangs at
the termini.

An appropriate technique for producing 8 fragment 
chains, each containing 8 fragments which can then be 
ligated together is illustrated in Figure 3. For
fragment chain 1, end fragments are used such that it is 
possible for the completed fragment chain to ligate to 
fragment chain 2 and so on. These may then be combined 
to produce a 64 fragment chain. Similarly, 8 such 
fragment chains may be combined to produce fragment 
chains comprising 512 fragments. 

As will be appreciated, as with the production of 
shorter chains, the step of ligation, when performed, is 
conveniently effected once all the fragment chains have 
been combined. However, the step of ligation may be 
performed sequentially if desired on addition of each 
subsequent fragment chain. 

To combine 8 binary fragments per cycle, 16 
different starting fragments are required, representing 
the different "0", "1" alternatives at each position. 
To make a chain of 64 fragments using two cycles, ie. to
produce 8 chains with 8 fragments which are then
ligated, only 16+(4x7)=44 starting fragments are
required. Thus, the number of different starting
fragments required increases almost linearly, in
contrast to the number of combinations of fragment chains
which can be produced, which increases exponentially with
the number of cycles. As a consequence, very long
fragment chains may be produced with a relatively small
number of starting fragments.
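
The arithmetic above can be checked with a short calculation. The sketch reads the 4x7 term as four additional end fragments (both termini, each in "0" and "1" form) for each of the 7 further chains; that reading is an assumption, and the function names are hypothetical.

```python
def starting_fragments_two_cycles(per_chain: int, chains: int, elements: int = 2) -> int:
    """Starting fragments needed to build `chains` chains of `per_chain` fragments."""
    one_chain = elements * per_chain      # e.g. 2 x 8 = 16 for the first chain
    extra_end_fragments = 2 * elements    # ASSUMPTION: 2 termini x 2 elements = 4
    return one_chain + extra_end_fragments * (chains - 1)


def possible_chains(total_fragments: int, elements: int = 2) -> int:
    """Number of distinct chains of the given length that can be written."""
    return elements ** total_fragments


print(starting_fragments_two_cycles(per_chain=8, chains=8))  # 44
print(possible_chains(64))                                   # 18446744073709551616
```

The 44 starting fragments thus give access to 2^64 different 64-fragment chains.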

Of course, as mentioned previously, intermediate 
chains longer or shorter than 8 may be produced. Since 
a large number of permutations exist in the overhang 
region, more starting fragments may be used thus
allowing larger fragments to be built up in a single
cycle. Thus, the number of cycles necessary to produce
long chains may be reduced.

Small fragment chains produced according to the 
methods described herein may also be attached together
by using variations of the techniques described herein.
For example, complementary primer pairs may be used to
link the various chains as described in Example 8. In
this technique, amplification of the fragment chains is
achieved using different primer pairs. The second
primer in primer pair 1 is complementary to the first
primer in primer pair 2 and the second primer in that
pair is complementary to the first primer in primer pair
3 and so on. PCR reactions are then performed which
produce products which in single stranded form are able
to bind to one another through their complementary ends
introduced by the primer pairs. These may then be
ligated together.

Alternatively, fragment chains prepared by the 
methods described herein may be amplified with a primer
which contains a restriction site for a nuclease which
cleaves outside its recognition site. These
amplification products are then digested with that
nuclease to produce non-palindromic overhangs at the end
of each fragment chain. By appropriate sequence
selection (e.g. in the primer or fragments which are
used) the overhangs which are generated allow the
different fragment chains to be combined in order.

In a preferred aspect therefore, the invention
provides a method of synthesizing a double stranded
nucleic acid molecule comprising at least the steps of:

1) generating fragment chains according to the method
described hereinbefore;

2) optionally generating single stranded regions at
the end of said fragment chains, wherein said single
stranded regions are complementary to other single
stranded regions on said fragment chains thus forming
complementary pairs of single stranded regions;

3) contacting said fragment chains with one another,
simultaneously or consecutively, to effect binding of
said complementary pairs of single stranded regions.

Optionally said chains are ligated together;
however, alternative techniques may be used to form the
ultimate chain, e.g. PCR may be used as described
herein.

Preferably intermediate fragment chains are between
4 and 20 fragments in length, e.g. 5 to 10, and between
5 and 50 such fragment chains are combined, e.g. between
10 and 20.

Conveniently fragments to be used in the method of
the invention are contained within libraries. Methods
of producing the fragments which make up the library are
well known in the art. For example a series of
oligonucleotides may be produced which comprise two
portions: a first portion which will form an overhang
at one end, and a second portion which will effect
binding to a complementary oligonucleotide and which
contains within that portion the information unit. By
producing common hybridizing portions and variant
overhangs, a series of double stranded oligonucleotides
for one or more code elements (denoted by at least a
part of the hybridizing portion) is created. This
provides a library for one (or a combination of) code
elements. Different libraries may be created for
different code elements (or combinations thereof), by
appropriate alteration of the information unit, ie. the
sequence in the hybridizing portion.

Conveniently for use in the invention, these 
different double stranded oligonucleotides are arranged 
in 2 dimensional arrays such that in one dimension 
consecutive positions within the ultimate fragment are 
indicated and in the second dimension the possible code
elements (or combinations thereof) are provided. In the
simplest case, in binary code, in which "0" and "1" are
represented by different sequences, the first dimension
would comprise fragments for each position of the
proposed fragment and the second dimension would have
only 2 variants ("0" and "1"). This may be viewed as a
single library or two libraries, ie. the "0" or "1"
libraries. Once these libraries are produced, fragment
chains with any desired order of fragments may be 
readily produced. 
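
The two-dimensional arrangement described above may be pictured as a lookup table. In this sketch the well addresses (a row letter for each code element, a column number for each position) are hypothetical and serve only to illustrate the layout, and the row lettering limits the sketch to 26 code elements.

```python
from itertools import product


def build_library(n_positions: int, elements: str) -> dict:
    """Map (position, code element) to a well address such as "A1".

    Rows (letters) are code elements; columns (numbers) are chain positions.
    """
    return {
        (position, element): f"{chr(ord('A') + row)}{position}"
        for (row, element), position in product(
            enumerate(elements), range(1, n_positions + 1)
        )
    }


def wells_for_chain(library: dict, chain: str) -> list:
    """List the wells to draw from, one per position, for the desired chain."""
    return [library[(position, element)]
            for position, element in enumerate(chain, start=1)]


library = build_library(8, "01")
print(len(library))                          # 16
print(wells_for_chain(library, "01001011"))
# ['A1', 'B2', 'A3', 'A4', 'B5', 'A6', 'B7', 'B8']
```

A library for n positions and m code elements therefore holds n x m fragments, from which any chain of length n may be selected.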

In order to appropriately direct library members to 
their correct site or well (ie. the library may be 
comprised of separate solid supports, or a solid support 
with different addresses, e.g. wells, or different wells 
containing different solutions), any appropriate sorting
technique may be used. This sorting may be achieved by 
virtue of the process used for production of the library 
members, or sorting may be achieved by an appropriate 
technique, e.g. by binding to complementary 
oligonucleotides at the relevant library site. 

Appropriate solid supports suitable for attaching 
library members are well known in the art and widely 
described in the literature and generally speaking, the 
solid support may be any of the well-known supports or 
matrices which are currently widely used or proposed for 
immobilization, separation etc. in chemical or 
biochemical procedures. Thus for example, the 
immobilizing moieties may take the form of beads, 
particles, sheets, gels, filters, membranes, microfibre 
strips, tubes or plates, fibres or capillaries, made for
example of a polymeric material e.g. agarose, cellulose,
alginate, teflon, latex or polystyrene. Particulate
materials, e.g. beads, are generally preferred.
Conveniently, the immobilizing moiety may comprise
magnetic particles, such as superparamagnetic particles.

In a preferred embodiment, plates or sheets are
used to allow fixation of molecules in linear
arrangement. The plates may also comprise walls
perpendicular to the plate on which molecules may be
attached. Attachment to the solid support may be
performed directly or indirectly and the technique which
is used will depend on whether the molecule to be
attached is an oligonucleotide for fixing the library
member or the library member itself. For attaching the
library members directly, ie. not via binding to an
oligonucleotide, conveniently attachment may be
performed indirectly by the use of an attachment moiety
carried on the nucleic acid molecules and/or solid
support. Thus for example, a pair of affinity binding
partners may be used, such as avidin, streptavidin or
biotin, DNA or DNA binding protein (e.g. either the lac
I repressor protein or the lac operator sequence to
which it binds), antibodies (which may be mono- or
polyclonal), antibody fragments or the epitopes or
haptens of antibodies. In these cases, one partner of
the binding pair is attached to (or is inherently part
of) the solid support and the other partner is attached
to (or is inherently part of) the nucleic acid
molecules. Alternatively, techniques of direct
attachment may be used such as for example if a filter
is used, attachment may be performed by UV-induced
crosslinking. When attaching DNA fragments, the natural
propensity of DNA to adhere to glass may also be used.
Oligonucleotides to be used for capture of the
library members may be attached to the solid support via
the use of appropriate functional groups on the solid
support.

Attachment of appropriate functional groups to the
solid support may be performed by methods well known in
the art, which include for example, attachment through
hydroxyl, carboxyl, aldehyde or amino groups which may
be provided by treating the solid support to provide
suitable surface coatings. Attachment of appropriate
functional groups to the nucleic acid molecules of the
invention may be performed by ligation or introduced
during synthesis or amplification, for example using
primers carrying an appropriate moiety, such as biotin
or a particular sequence for capture.

In a further aspect therefore the present invention
provides a library of fragments as defined herein
comprising (n)m fragments, wherein n is as defined
hereinbefore and corresponds to the length of chain that
said library may produce, and m is an integer
corresponding to the number of possible code elements or
combinations thereof, such that fragments corresponding
to all possible code elements for each position in the
final chain are provided.

Portions of said libraries in one dimension, ie.
comprising n fragments for only a single code element
(or combinations thereof) or comprising m fragments
representing all code elements (or combinations thereof)
for a single position on the chain, form further aspects
of the invention.

Appropriate mixing may be achieved by automation.
For example in the case of "0", "1" fragments, the
correct combination of these elements is the critical
step in terms of resource- and time-consumption. This
method is described in more detail in Example 2. In
particular, the procedure may be miniaturised providing
appropriate amplifying methods (such as cloning and/or
PCR) are employed in the last step. Thus, techniques
using technology such as sorting using flow cytometers
may be employed as described in Figure 4C. Such sorting
procedures are well established and are able to sort
approximately 5-30000 droplets per second for standard
equipment, but up to 300000 droplets per second for the
most advanced cytometers.

As mentioned previously, it is possible that each
fragment may denote more than a single code element. If
for example, each fragment denotes 5 code elements,
using existing technology and a library of 32x100
library components, if 3200 containers were connected to
a sorting device illustrated in Figure 4C, it should be
possible to write several thousand chains with 500 code
elements per second. Clearly, a method which can
generate nucleic acid sequences with such rapidity
offers significant advantages over known methods in the
art.
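
The figures quoted above can be related to one another by a short calculation. The sketch assumes binary code elements, so that 5 elements per fragment give 2^5 = 32 variants per position. The rate estimate further assumes that only the droplets actually directed into a chain count against the sorting rate; neither assumption is stated in the text and the function names are hypothetical.

```python
def library_dimensions(elements_per_fragment: int, positions: int,
                       alphabet_size: int = 2) -> dict:
    """Size of a library in which each fragment denotes several code elements."""
    variants = alphabet_size ** elements_per_fragment  # ASSUMES a binary code
    return {
        "variants_per_position": variants,
        "containers": variants * positions,
        "code_elements_per_chain": elements_per_fragment * positions,
    }


def chains_per_second(droplets_per_second: int, positions: int) -> int:
    """Rough upper bound on the writing rate.

    ASSUMPTION: one sorted droplet per chain position, and droplets sent
    to waste are not counted against the sorting rate.
    """
    return droplets_per_second // positions


print(library_dimensions(5, 100))
# {'variants_per_position': 32, 'containers': 3200, 'code_elements_per_chain': 500}
print(chains_per_second(300000, 100))  # 3000
```

Under these assumptions the fastest sorting rate quoted corresponds to about three thousand 500-element chains per second.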

The nucleic acid molecule (ie. the fragment chain)
produced according to the above described method and the
single stranded molecules thereof comprise further
features of the invention. These molecules may as
appropriate be included into a vector, as described
hereinbefore.

Once produced, the fragment chains, in double
stranded or single stranded form, may be used in various
applications, as described hereinafter. One application
of particular utility is to store information. In such
cases appropriate means of reading the information
stored in those chains are required. In some
applications, fragment chains may be appropriately
addressed to particular sites, e.g. through binding to
oligonucleotides carried on solid supports which are
complementary to overhangs on one terminus of the
fragment chains. Alternatively appropriate
antibody/antigen, or DNA:protein recognition systems may
be used. Thus, information stored in molecules
addressed in this way, or in solution, may then be
accessed.

Co-pending application PCT/GB99/04417, a copy of
which is appended hereto, describes appropriate
techniques for addressing and reading information
contained in nucleic acid molecules. Of particular note
in this respect are techniques in which fluorescence of
probes carrying fluorescent labels directed to
particular sequences is detected. In such techniques,
probes, carrying labels as described hereinbefore, may
be directed to particular fragment regions, particularly
to regions denoting code elements. The signals
generated (directly or indirectly) by those labels may
then be detected and the code element thereby
identified. If a simple binary system is used only 2
discrete labels are required and their pattern of
binding may be determined. Alternatively, if a more
complex code is reflected in the fragment chains,
correspondingly more discrete labels are required for
unambiguous detection.

Thus in a further aspect, the present invention
provides a method of identifying the code elements
contained in a nucleic acid molecule prepared as
described hereinbefore (ie. fragment chain) wherein a
probe, carrying a signalling means (e.g. a label),
specific to one or more code elements, is bound to said
nucleic acid molecule and a signal generated by said
signalling means is detected, whereby said one or more
code elements may be identified.

Preferably said signalling means is a label as 
described hereinbefore.

A "probe" as referred to herein refers to an 
appropriate nucleic acid molecule, e.g. made up of DNA,
RNA or PNA sequences, or hybrids thereof, which is able
to bind to the target nucleic acid molecule (which may
be single or double stranded) through specific
interactions, ie. is specific to particular code
elements, e.g. through complementary binding to a
particular sequence. Probes may be any convenient
length, to allow specific binding, e.g. in the order of
5 to 50 bases, preferably 8 to 20 bases in length.

A "signalling means" as used herein refers to a
means for generating a signal directly or indirectly. A 
signal may be any physical or chemical property which 
may be detected, e.g. presence of a particular product, 
colour, fluorescence, radiation, magnetism, 
paramagnetism, electric charge, size, or volume. 
Preferably the label is a fluorophore whose fluorescence
is detected. In such cases fluorescence scanners may be 
used for detection of the label and thereby 
identification of the code elements. 

A particular code element or combination of
elements may be identified by the appearance of a
particular signal. Clearly the position of each signal
is crucial to determining the sequence of the code
elements. As a consequence methods in which positional
information (absolute or relative) may be obtained
should be used. Appropriate techniques, e.g. using
target molecules which have been attached to a solid
support at one end, are described in co-pending
application PCT/GB99/04417.
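
The role of positional information in reading may be illustrated as follows. The label names, the distances and the function name are hypothetical; the sketch assumes that each detected signal is reported with its distance from the end attached to the solid support.

```python
# HYPOTHETICAL assignment of two discrete labels to the binary code elements.
LABEL_TO_ELEMENT = {"green": "0", "red": "1"}


def read_chain(signals: list) -> str:
    """Order (distance, label) signals by distance and translate them to code.

    `distance` is measured from the attached end of the target molecule,
    so sorting by distance restores the order of the code elements.
    """
    ordered = sorted(signals, key=lambda signal: signal[0])
    return "".join(LABEL_TO_ELEMENT[label] for _, label in ordered)


# Signals may be detected in any order; position restores the sequence.
detected = [(3.0, "green"), (1.0, "green"), (2.0, "red")]
print(read_chain(detected))  # 010
```

Without the distances, the same three signals could not be distinguished from "001" or "100".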

A number of applications exist for the fragment
chains once produced in nano- and pico-technology, inter
alia for example by stretching of the fragment chains by
means of a stream of liquid, electricity or other
technology and using them as templates for nano- and
pico-structures. The products may also be used to label
products which can then be screened to establish their
identity. Alternatively, the molecules may be used to
store information, e.g. pictures, text, music or as data
storage in DNA computers. The rapid production and
reading techniques make such applications possible for
the first time.

Kits for performing the methods described above
form a preferred aspect of the invention. Thus viewed
from a further aspect the present invention provides a
kit for synthesizing a double stranded nucleic acid
molecule comprising at least n double stranded nucleic
acid fragments, wherein at least n-2 fragments have
single stranded regions at both termini and 2 fragments
have single stranded regions at at least one terminus,
wherein (n-1) single stranded regions are complementary
to (n-1) other single stranded regions, thereby
producing (n-1) complementary pairs. Preferably in
excess of n fragments are supplied for production of a
chain of n fragments, such that selection of appropriate
fragments for different positions is possible. Thus in
a preferred feature said kit comprises (n)m fragments,
wherein n is as defined hereinbefore, and m is an
integer corresponding to the number of possible
variations, e.g. unique sequences or code elements or
combinations thereof, such that fragments corresponding
to all possible sequences or code elements for each
position in the final chain are provided. Preferably
these fragments are provided in appropriate libraries
arranged with reference to their position within the
fragment chain and the code element(s) which they
represent, such that desired fragments may be readily
selected from the array.

Optionally the kit may contain other appropriate
components selected from the list including ligases,
enzymes necessary for inactivation and activation of
restriction or ligation sites, primers for amplification
and/or appropriate enzymes, buffers and solutions. The
use of such kits for performing the method of the
invention forms a further aspect of the invention.

The following examples are given by way of illustration
only in which the Figures referred to are as follows:

Figure 1 shows a schematic representation of how the
method of the invention may be used to introduce an
insert into a vector, in which the insert is cleaved
from the first nucleic acid molecule, associated with
adapters and ligated thereto and then ligated into the
vector;

Figure 2 shows the production of a fragment chain using
8 "0" and "1" starting fragments with different
overhangs;

Figure 3 shows the production of a 64 fragment chain in
which 8 chains are produced comprising 8 fragments each,
in which the termini of chains 1 and 2, and 2 and 3 etc.
are complementary such that they may be ligated
together;

Figure 4 shows 3 techniques for mixing "0", "1"
fragments from a library of fragments ordered for each
position, in which in A) appropriate fragments are
selected by aspiration from appropriate wells, B)
appropriate fragments are released from the library
wells and C) a flow cytometer is used to direct
appropriate droplets to the mixing chamber;

Figure 5 shows PCR amplification of signal chain
1-0-1-0-0 using SP6 and T7 primers. Lane 1: 1 µg of 1 kb
DNA ladder (Gibco BRL), Lane 2: 10 µl of PCR amplified
fragment chain DNA using SP6 and T7 primers. Lane 3:
Same as lane 2 except for the use of SP6 and T7-Cy5
primers; and

Figure 6 shows the use of primer pairs during the
process of amplification to join together fragment
chains.

EXAMPLE 1: CLONING OF AN INSERT INTO A VECTOR, FOR
EXAMPLE FROM PHIX174 INTO PUC19

A general procedure to be followed using IIS and IP
enzymes to achieve cloning involves the use of a cloning
vector which has the following characteristics:

1) A multiple cloning site located within a gene
(lacZ, ccdB or other) that allows the detection of
successful insertion.

2) The multiple cloning site contains two flanking
HgaI sites that generate overhangs that differ from
other HgaI generated overhangs elsewhere in the vector.
The orientation of the HgaI sites ensures excision of
the sites from the vector part during digestion. To
minimize background due to undigested plasmids, several
HgaI sites and other suitable restriction enzyme sites
are included in the MCS. The restriction enzymes are
chosen such that they cleave well in HgaI buffer and do
not have other sites in the vector.

The donor plasmid is cut with the appropriate set of IIS 
and/or IP enzymes. Adapters are used to specify the 
fragment to be sub-cloned into the vector, by the use of 
appropriate single stranded regions on the adapters to 
the overhangs generated on the insert. This results in 
the molecule: vector - adapter I - insert (e.g. PhiX174
gene) - adapter II - vector.

This method is illustrated for insertion of a PhiX174
insert into a vector, e.g. pUC19. An HgaI site in a
pUC19 plasmid is chosen randomly to be our "polylinker"
while different genes and gene combinations from the
PhiX174 genome are used as "inserts".

Genes are organized in PhiX174 as illustrated below
which shows the position of genes A, B, C and E relative
to one another:

In the above, gene B is located inside gene A while gene
C is slightly overlapping with gene A (by 3 base pairs).
Genes D and K are located in the same area as genes C and
E, but are not shown. This genome area contains 9 BbvI
sites as shown on the bottom row, in which the overhang
pairs that will be generated by cutting with BbvI are as
follows with the base pair position indicated in
brackets: 1-CAGC/GTCG (3798), 2-CTGC/GACG (4215),
3-ACGG/TGCC (4398), 4-GCAT/CGTA (4677), 5-CTAT/GATA
(5049), 6-GAGA/CTCT (158), 7-GAGC/CTCG (547),
8-CAAC/GTTG (624), 9-CCAT/GGTA (892). The parts of the
PhiX174 genome not shown contain 5 more BbvI sites:
10-TACC/ATGG (1488), 11-TACC/ATGG (1592), 12-CTAC/GATG
(1639), 13-GCAC/CGTG (3294), 14-CTAA/GATT (3297). Of
these only 12 give rise to non-identical overhangs
whilst 2 result in identical overhangs.
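
The count of identical and non-identical overhangs can be verified from the list above. In this sketch the sites are numbered 1 to 14 in the order listed, the overhang pairs are compared exactly as written, and the variable names are hypothetical.

```python
from collections import defaultdict

# Overhang pairs of the 14 sites, numbered in the order listed in the text.
BBVI_OVERHANGS = {
    1: "CAGC/GTCG", 2: "CTGC/GACG", 3: "ACGG/TGCC", 4: "GCAT/CGTA",
    5: "CTAT/GATA", 6: "GAGA/CTCT", 7: "GAGC/CTCG", 8: "CAAC/GTTG",
    9: "CCAT/GGTA", 10: "TACC/ATGG", 11: "TACC/ATGG", 12: "CTAC/GATG",
    13: "GCAC/CGTG", 14: "CTAA/GATT",
}

# Group the site numbers by the overhang pair they generate.
sites_by_overhang = defaultdict(list)
for site, overhang in BBVI_OVERHANGS.items():
    sites_by_overhang[overhang].append(site)

shared = {o: s for o, s in sites_by_overhang.items() if len(s) > 1}
unique_sites = [s[0] for s in sites_by_overhang.values() if len(s) == 1]

print(shared)             # {'TACC/ATGG': [10, 11]}
print(len(unique_sites))  # 12
```

Sites 10 and 11 are the two that share an overhang pair; the remaining 12 are distinct.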

When HgaI is used to cleave pUC19, 4 non-identical sites
are cleaved, giving rise to 8 non-identical overhangs.
These are: 1-CTGCC/GACGG (573), 2-TTCTC/AAGAG (1131),
3-CAAGG/GTTCC (1881), 4-AGACT/TCTGA (2459).

Method: 

To sub-clone gene B from Bacteriophage PhiX174 into the
designed vector, the following protocol is used:

1) 2 µg of PhiX174 DNA is digested with 2 U of BbvI (NEB)
in 1X buffer 2 (NEB), water added to a volume of 20 µl,
for 1 hr at 37°C. BbvI is then heat inactivated at 65°C
for 20 minutes.

2) 2 µg of vector (e.g. pUC19) is digested with 2 U HgaI
(NEB) in 1X buffer 1 (NEB), water added to a volume of
20 µl, for 1 hr at 37°C. HgaI is then heat inactivated at
65°C for 20 minutes.

3) The adapters are made in separate tubes by mixing
the oligonucleotides in pairs (selected to obtain the
desired product, ie. particular gene(s), in
forward/reverse orientation) and allowing annealing.

4) 6 µl of the cleavage reaction of PhiX174 is mixed
with 2 µl of the cleavage reaction of the vector and
ligated in the presence of 5-50 pmol of each adaptor,
2-10 U/µl T4 DNA Ligase (NEB), 1X ligase buffer (NEB)
and 5% polyethylene glycol 8000, water added to a volume
of 30 µl, at 25°C for 1 hr.

5) Conventional methods are used to transform bacteria.

6) The colonies are then counted and some of them are
then picked for further analysis (sequencing, and the
like).

Materials:

Oligonucleotides used to address PhiX174 overhangs:
BbvI overhang 1a:

5'- CGA GCG CCT CCA GTG CAG CGG AG
BbvI overhang 5a:

5'- TATC GCG CCT CCA GTG CAG CGG AG
BbvI overhang 6b:

5'- CTCT GCG CCT CCA GTG CAG CGG AG
BbvI overhang 6 (delC):

5'- CTCT CTC CGC TGC ACT GGA GGC GC
BbvI overhang 7a:

5'- CAAC GCG CCT CCA GTG CAG CGG AG
BbvI overhang 9b:

5'- GGTA GCG CCT CCA GTG CAG CGG AG

Oligonucleotides used to address pUC19 overhangs:
Cloning site 1a

5'- AAGAG CTC CGC TGC ACT GGA GGC GC
Cloning site 1b

5'- CTCTT CTC CGC TGC ACT GGA GGC GC
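
The oligonucleotides listed above appear to consist of a short overhang-addressing portion followed by one of two 20-base common regions. The sketch below checks that these two common regions are reverse complements of one another, which would allow the oligonucleotides to anneal in pairs as in step 3 of the protocol. Reading the sequences in this way is an interpretation rather than something stated in the text, and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return sequence.translate(COMPLEMENT)[::-1]


def split_adapter(oligo: str, overhang_length: int) -> tuple:
    """Split an oligonucleotide into its overhang portion and common region."""
    bases = oligo.replace(" ", "")
    return bases[:overhang_length], bases[overhang_length:]


# Two of the oligonucleotides listed above (4-base BbvI overhang portions).
overhang_5a, common_a = split_adapter("TATC GCG CCT CCA GTG CAG CGG AG", 4)
overhang_6, common_b = split_adapter("CTCT CTC CGC TGC ACT GGA GGC GC", 4)

print(overhang_5a, common_a)                     # TATC GCGCCTCCAGTGCAGCGGAG
print(reverse_complement(common_a) == common_b)  # True
```

The pUC19 oligonucleotides carry the same common region behind a 5-base portion, so they may be split with an overhang length of 5.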

Two important advantages of this recombination method
over the classical Cohen-Boyer method should be noted.
The procedure is very easy to perform. It involves only
mixing and incubation steps before transformation. No
PCR amplifications or gel separations are required.
The method gives significant flexibility and allows
complex recombinations to be made even with only two
restriction enzymes.

EXAMPLE 2: AUTOMATION AND MINIATURISATION OF CHAIN
SYNTHESIS

This method describes a rapid process for mixing
appropriate "0" and "1" fragments with the correct
overhangs to produce a particular string consisting of
"0"s and "1"s.

Two libraries are produced, one with "0" fragments and
one with "1" fragments. As mentioned in the
description, these are generated with overhangs that can
be ligated to corresponding overhangs for fragments at
adjacent positions. These separate members are present
in separate wells to form the library, such that
position 1 fragments are present in well 1, position 2
fragments are present in well 2 and so forth. The two
libraries thus provide the alternatives for each
position. In order to generate the chain therefore it
is only necessary to select the correct fragment "0" or
"1" for position 1, and then position 2 etc. Since
these fragments, as a consequence of their unique
overhangs, may only hybridize to fragments for adjacent
positions, it is necessary only to select the correct
fragments, then mix and ligate those fragments
simultaneously. Different ways of achieving this effect
are shown in Figure 4 which shows three different
alternatives for mixing.

In Figure 4A, e.g. to produce the chain 0-1-0-0-1, the 
apparatus is used to aspirate from the "0" library at 
positions 1, 3 and 4, and aspirate from the "1" library
at positions 2 and 5. The liquids that have been
aspirated may then be mixed together with ligase and an 
appropriate buffer. In alternative B, each well in the 
library is connected with a tube/nozzle that may be 
closed/opened electronically. Liquid from the nozzles 
is directed into the ligation chamber together with 
ligase and an appropriate buffer. Different chains may 
be constructed by appropriately changing the pattern of 
nozzles which are opened/closed. 
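
The selection step described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name wells_to_aspirate and the representation of a pick as a (library, well) tuple are assumptions made here, not part of the apparatus of Figure 4.

```python
def wells_to_aspirate(chain):
    """Map a binary chain such as '01001' to (library, well) picks.

    Wells are numbered from 1: position k fragments sit in well k of
    the "0" library and in well k of the "1" library.
    """
    picks = []
    for position, bit in enumerate(chain, start=1):
        if bit not in ("0", "1"):
            raise ValueError(f"unexpected symbol {bit!r} at position {position}")
        picks.append((bit, position))
    return picks


# The chain 0-1-0-0-1 used in the description of Figure 4A.
picks = wells_to_aspirate("01001")
zero_wells = [well for lib, well in picks if lib == "0"]
one_wells = [well for lib, well in picks if lib == "1"]
print("aspirate from '0' library wells:", zero_wells)
print("aspirate from '1' library wells:", one_wells)
```

The same list of picks could equally drive the open/closed pattern of the nozzles in alternative B.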

The procedure may also be miniaturised, e.g. using flow
cytometry technology as illustrated in Figure 4C. In
this method, library components are stored in containers
on top of the "writing-machine". Droplets from each
container are then guided either to the waste or to the
production well depending on the nature of the chain
that is to be constructed. The guiding mechanism is as
used in ordinary flow cytometers, i.e. the droplets are
charged when they leave the container and may be guided
electronically in different directions.

EXAMPLE 3 - LIBRARIES COMPRISING OLIGONUCLEOTIDES FOR 
USE IN THE INVENTION 

Conveniently, the cloning method may be performed using
libraries containing oligonucleotides. For example a
library may contain:

1. Oligonucleotides with a common portion and 5 bases
at the 5' end which vary to provide all possible
permutations, i.e. 1024 variants.

2. Oligonucleotides with a common portion and 4 bases
at the 5' end which vary to provide all possible
permutations, i.e. 256 variants.

3. Oligonucleotides with a common portion and 5 bases
at the 3' end which vary to provide all possible
permutations, i.e. 1024 variants.

4. Oligonucleotides with a common portion and 6 bases
at the 3' end which vary to provide all possible
permutations, i.e. 4096 variants.
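
The variant counts quoted in the list above follow from four possible bases at each variable position. A minimal sketch that enumerates the variable ends is given here; the helper name variable_ends is an illustrative assumption.

```python
from itertools import product

BASES = "ACGT"


def variable_ends(n):
    """Return every possible n-base sequence over A, C, G and T."""
    return ["".join(p) for p in product(BASES, repeat=n)]


# Library sizes for 4, 5 and 6 variable bases.
library_sizes = {n: len(variable_ends(n)) for n in (4, 5, 6)}
print(library_sizes)
```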

In the above, the oligonucleotides are produced such
that all "1" oligonucleotides are complementary to "2"
oligonucleotides by virtue of the invariant bases, i.e.
to generate a double stranded molecule with variant 4/5
base overhangs. Similarly "3" and "4" oligonucleotides
are complementary.

Oligonucleotides combined in this way (i.e. with
overhangs at either end of 4-6 bases) may also be
combined together with complementary double stranded
oligonucleotides also generated by combining certain
members of the library. In this way variable overhangs
of different lengths may be created in the resultant
molecule, e.g. a molecule with a 4 base overhang at both
the 3' and 5' end.

Oligonucleotides may also be provided in the library
which allow 5' and 3' adapters to be linked. Thus for
example oligonucleotides having the following form may
be provided:

5.  5'-AAAA-[comp1]-FFFFF-3'
6.  5'-DDDDD-[comp1]-FFFFF-3'
7.  5'-AAAA-[comp1]-HHHHHH-3'
8.  5'-DDDDD-[comp1]-HHHHHH-3'
9.  3'-[comp1*]-5'
10. 5'-BBBB-[comp2]-3'
11. 5'-EEEEE-[comp2*]-3'
12. 5'-[comp3]-GGGGG-3'
13. 5'-[comp3*]-IIIIII-3'

in which "compx" refers to a region which is
complementary to region "compx*", i.e. "5", "6", "7" or
"8" can bind to "9". Furthermore, "comp2" can bind to
oligonucleotide "1" above, "comp2*" can bind to
oligonucleotide "2", "comp3" can bind to oligonucleotide
"4" and "comp3*" can bind to oligonucleotide "3". The
bases denoted "A" bind to "B", i.e. "7" and "10" can bind
at their ends. Similarly "D" binds to "E", "F" binds to
"G" and "H" binds to "I". (These bases when together
may have a variable content, e.g. AAAA=GAGA and then
BBBB=TCTC.)

By appropriate use of the linkers described above, 5'
and 3' adapters may be combined. For example,
oligonucleotide "2" with a particular 4 base 5' overhang
may be bound through its complementary region to an
oligonucleotide linker "11", which will then leave an
"EEEEE" overlap. This may be bound to oligonucleotide
"8" through the overlap, which may itself bind
oligonucleotide "9" through its complementary region.
The overlap "HHHHHH" may be bound to oligonucleotide
"13", which may attach an oligonucleotide "4" through
binding to the complementary region. Thus various
permutations may be made which result in various overlap
lengths, e.g. any combination of 4, 5 or 6 base
overlaps, which may be on the same or different strands.

EXAMPLE 4 - TRIMMING PROCEDURE FOR GENERATING UNIQUE
OVERHANGS

The system presented here makes it possible to perform a
trimming procedure with seven different IIS enzymes that
make 5' 4 base overhangs (FokI and Bst71I), 5' 5 base
overhangs (HgaI), 3' 5 base overhangs (BplI and BaeI)
and 3' 6 base overhangs (CjeI and HaeIV). If the
oligonucleotide system presented here is combined with
the basic oligonucleotide kit described in Example 3,
all permutations of 3' 5 base and 6 base overhangs and
all permutations of 5' 4 base and 5 base overhangs can
be addressed for the trimming procedure.

In this Example, the location of the binding motifs of
the initiation linkers is shown below:

FokI       GGATG
Bst71I     --GCAGC
HgaI       GACGC
BplI       GAG CTC
BaeI       CYATG CA
CjeI       CCA GT
HaeIV      GAY RTC
Consensus  --GCAGCGACCATGAGTCCA-CTC--GTGGATGACGC

Initiation linkers:

X=0: 5'--GCAGCGACCATGAGTCCA-CTC--GTGGATGPPPPPP
     3'--CGTCGCTGGTACTCAGGT-GAG--CACCTAC
X=1: 5'--GCAGCGACCATGAGTCCA-CTC--GTGGATG-PPPPPP
     3'--CGTCGCTGGTACTCAGGT-GAG--CACCTAC-
X=2: 5'--GCAGCGACCATGAGTCCA-CTC--GTGGATG--PPPPPP
     3'--CGTCGCTGGTACTCAGGT-GAG--CACCTAC--
X=3: 5'--GCAGCGACCATGAGTCCA-CTC--GTGGATG---PPPPPP
     3'--CGTCGCTGGTACTCAGGT-GAG--CACCTAC---
X=4: 5'--GCAGCGACCATGAGTCCA-CTC--GTGGATGACGCPPPPPP
     3'--CGTCGCTGGTACTCAGGT-GAG--CACCTACTGCG
X=5: 5'--GCAGCGACCATGAGTCCA-CTC--GTGGATGACGC-PPPPPP
     3'--CGTCGCTGGTACTCAGGT-GAG--CACCTACTGCG-
X=6: 5'--GCAGCGACCATGAGTCCA-CTC--GTGGATGACGC--PPPPPP
     3'--CGTCGCTGGTACTCAGGT-GAG--CACCTACTGCG--
X=7: 5'--GCAGCGACCATGAGTCCA-CTC--GTGGATGACGC---PPPPPP
     3'--CGTCGCTGGTACTCAGGT-GAG--CACCTACTGCG---
X=8: 5'--GCAGCGACCATGAGTCCA-CTC--GTGGATGACGC----PPPPPP
     3'--CGTCGCTGGTACTCAGGT-GAG--CACCTACTGCG----
X=9: 5'--GCAGCGACCATGAGTCCA-CTC--GTGGATGACGC-----PPPPPP
     3'--CGTCGCTGGTACTCAGGT-GAG--CACCTACTGCG-----

The 6 base 3' overhang PPPPPP is a non-palindromic
sequence that can be ligated with the complementary
overhang QQQQQQ. Ten different initiation linkers are
needed because BaeI cuts 10 bases away from its binding
site. These linkers therefore allow a trimming procedure
where BaeI "jumps" 10 bases for each trimming cycle, so
that 10 different start positions are necessary to cover
all possibilities. HgaI, on the other hand, cuts only 5
bases away, necessitating only 5 different start
positions. This is the reason the binding site for HgaI
is not present on X=0 to X=3, above.

Propagation linkers:

FokI:    5' GGATG
         3' CCTACNNNN
Bst71I:  5' GCAGC
         3' CGTCGNNNN
HgaI:    5' GACGC
         3' CTGCGNNNNN
BplI:    5' GAG CTCNNNNN
         3' CTC GAG
BaeI:    5' CCATG CANNNNN
         3' GGTAC GT
HaeIV:   5' GAC GTCNNNNNN
         3' CTG CTG
CjeI:    5' CCA GTNNNNNN
         3' GGT CA

Termination linkers:

The adapters made with the basic oligonucleotides
described earlier can be used as termination linkers.
There is therefore no need for a separate set of
termination linkers.

Method:

In this method a trimming reaction using Bst71I that
will begin on a 3' 5 base overhang is shown. The target
DNA is shown below, in which the first overhang that will
be generated is marked "****".

             ****
3' CACTT     ****

The first Bst71I overhang in the target DNA will be
located 5-8 bases downstream of the overhang CACTT-3'. X
must therefore be 3 (see the figure below). The
following strategy can then be applied:

One linker is prepared that can address the 3' GTGAA
overhang by annealing oligonucleotide "4" (3' 6 bases,
QQQQQQ) with oligonucleotide "3" (3' 5 bases, GTGAA) in
one tube:

      GTGAA-3'
3'-QQQQQQ

The 3'-GAGTGC overhang is then ligated with the X=3
initiation linker and the GTGAA-3' overhang is ligated
with the CACTT-3' overhang on the target DNA molecule:

5'--GCAGCGACCATGAGTCCA-CTC--GTGGATG---PPPPPP
3'--CGTCGCTGGTACTCAGGT-GAG--CACCTAC---QQQQQQ
                                            GTGAA
                                            CACTT

EXAMPLE 5 - REMOVAL OF INTERVENING SEQUENCES FROM
CONSTRUCTS

In some instances, constructs may be prepared which
contain undesirable nucleic acid sequences between, e.g.,
the insert sequence and the vector sequence. Strategies
for removing the linker sequences should then be
applied. Illustrated below are some possible strategies
in which binding sites for restriction enzymes are
provided in the adapter sequences. Cleavage with the
restriction enzymes will then result in DNA ends that
can be religated. The vector DNA is marked as ..VVVVVVV
while insert DNA is marked as IIIIIII.

Method 1

Two IIS enzymes that generate 5' 4 base overhangs (BbsI
and Esp3I):

..VVVVVVVVGAGC-GAGACG  GAAGAC--GAGCIIIIIIIIII
  VVVVVVVVCTCG-CTCTGC  CTTCTG--CTCGIIIIIIIIII..

After cleavage with BbsI and Esp3I:

..VVVVVVVV      +  GAGC-GAGACG  GAAGAC--      +  GAGCIIIIIIIIII
  VVVVVVVVCTCG         -CTCTGC  CTTCTG--CTCG         IIIIIIIIII..

After ligation with T4 DNA ligase:

GAGC-GAGACG  GAAGAC-      +  ..VVVVVVVVGAGCIIIIIIIIII
    -CTCTGC  CTTCTG-CTCG       VVVVVVVVCTCGIIIIIIIIII..

Method 2

One IIS enzyme that generates two 3' 3 base
overhangs (BsaXI):

..VVVVVVVVGAG  AC  CTCC  GAGIIIIIIIIII
  VVVVVVVVCTC  TG  GAGG  CTCIIIIIIIIII

After cleavage with BsaXI:

..VVVVVVVVGAG  +     AC  CTCC  GAG  +     IIIIIIIIII
  VVVVVVVV        CTC  TG  GAGG        CTCIIIIIIIIII..

After ligation with T4 DNA ligase:

   AC  CTCC  GAG  +  ..VVVVVVVVGAGIIIIIIIIII
CTC  TG  GAGG          VVVVVVVVCTCIIIIIIIIII..

Method 3

One IIS enzyme that generates blunt ends (MlyI):

..VVVVVVVV  GAGTC  IIIIIIIIII
  VVVVVVVV  CTGAG  IIIIIIIIII..

After cleavage with MlyI:

..VVVVVVVV  +  GAGTC-     IIIIIIIIII
  VVVVVVVV     CTGAG      IIIIIIIIII..

After ligation with T4 DNA ligase:

GAGTC  +  ..VVVVVVVVIIIIIIIIII
CTGAG       VVVVVVVVIIIIIIIIII..

EXAMPLE 6 - IDENTIFYING OLIGONUCLEOTIDE SETS WITH 6 BASE
PAIR OVERHANGS WITH MINIMAL MIS-MATCH LIGATIONS

In order to identify oligonucleotide sets with 6 base
pair overhangs which are unlikely to form mis-match
ligations with one another the following steps may be
taken.

1. Create all 2048 overhang pairs of 6 bases.

2. Remove the 32 palindromic pairs.

This produces a final set of 2016 overhang pairs.
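
The counts in the two steps above can be checked by brute-force enumeration. In the sketch below an overhang pair is taken to be a 6-base overhang together with its reverse complement, and a palindromic overhang is one that equals its own reverse complement; both readings are assumptions made here for illustration. On this reading the 4096 possible overhangs correspond to the 2048 pairs of step 1 (4096/2) and the 64 palindromic overhangs to the 32 palindromic pairs of step 2 (64/2).

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


all_overhangs = ["".join(p) for p in product("ACGT", repeat=6)]

# Overhangs that are identical to their own reverse complement.
palindromes = [s for s in all_overhangs if s == reverse_complement(s)]

# Group every non-palindromic overhang with its reverse complement.
pairs = {
    frozenset((s, reverse_complement(s)))
    for s in all_overhangs
    if s != reverse_complement(s)
}

print(len(all_overhangs), len(palindromes), len(pairs))
```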
PART 1 

1. Take a pair as pair #1 and select the next pair by
executing section 1.

Section 1
Algorithm 1

Compute the (2016 - n) tables of unweighted mismatch
scores between the already chosen n pair(s) and all
(2016 - n) remaining pairs, and find among the latter
the pair(s) for which the lowest score in the table is
the highest (see below for details about score
computation). If there is only one such pair, then
select it. If there are several pairs, then compute the
weighted mismatch scores of the overhang comparisons
that gave the lowest unweighted score and find the
pair(s) for which the lowest weighted score is the
highest. If there is only one such pair, then select
it. If there are several pairs, then redo the whole
procedure using the second lowest unweighted score in
the mismatch table, then the third lowest, and so on.
If several pairs remain tied after all mismatch scores
have been considered, keep them all.

Repeat algorithm 1 for each selected pair and iterate it
over the desired number of positions to obtain the
chain(s) of overhang pairs. This procedure generates a
tree with an overhang pair on each branch. The lowest
unweighted and weighted mismatch scores of the
particular combination of pairs at each point are
computed. A particular pathway is stopped (1) when the
desired number of positions is reached, or (2) when the
combination of pairs is one that has already been found
earlier, or (3) when the lowest mismatch scores of that
combination are lower than the lowest scores of the
complete chain(s) already constructed. Point (3) ensures
that each new complete chain always has lowest mismatch
scores that are higher than or at least equal to those
of the previously constructed chain(s). Note also that,
as a result of this process, all pairs in a given chain
are unique and all complete chains in the tree are
unique. The whole process terminates when the last
pathway to be explored stops. Keep the complete
chain(s) whose lowest mismatch scores are the highest.

Repeat section 1 starting with each of the 2016 pairs as
pair #1 to produce a set of 2016 overhang chains. Find
the best chain(s) by applying algorithm 2.

Algorithm 2

For all chains, compute the tables of unweighted
mismatch scores between all the pairs that are present
in the chain, and find the chain(s) for which the lowest
score in the table is the highest (see below for
details). If there is only one such chain, then select
it. If there are several chains, then compute the
weighted mismatch scores of the overhang comparisons
that gave the lowest unweighted score and find the
chain(s) for which the lowest weighted score is the
highest. If there is only one such chain, then select
it. If there are several chains, then redo the whole
procedure using the second lowest unweighted score in
the mismatch table, then the third lowest, and so on.
If several chains remain tied after all mismatch scores
have been considered, then keep all of them.

This allows the production of a set of one or more 
overhang chains. 

PART 2 

Take a chain and execute section 2. 

Section 2 
Algorithm 3 

For that chain, find the overhang pair(s) that is (are) 
responsible for the lowest unweighted and weighted 
scores in the table of mismatch scores between all pairs 
in the chain. Then, create new chains by substituting 
that pair with all remaining overhang pairs that are not 
present in the original chain (if there are several 
pairs to be substituted, substitute one pair at a time).
From the complete set of newly generated chains and the 
original chain, select one or more chains following 
algorithm 2. Here, including the original chain into 
algorithm 2 ensures that the selected chains always have 
a mismatch score that is higher than or at least equal 
to the score of the original chain. The improvement (if 
any) may involve the lowest or nth lowest unweighted 
score, or the corresponding weighted score. 

Repeat algorithm 3 for each selected chain. This
procedure generates a tree with a chain on each branch.
Each new chain which is added to the tree has a mismatch
score higher than or equal to the score of the chain
found in the previous step. A particular pathway is
stopped when the selected chain is one that has already
been found earlier. This ensures that all chains in the
tree are unique. The whole process terminates when the
last pathway to be explored stops. Keep all the chains
that are present in the tree.

Repeat section 2 (i.e., construct a tree) starting with
each of the chains selected at the end of part 1.

From the whole set of chains present in all trees,
select one or more chains following algorithm 2.

This produces a final set of one or more overhang
chains.

COMPUTATION OF MISMATCH SCORES
Unweighted score

The unweighted score for a ligation between two 6-base
overhangs is the number of mismatches observed,
considering the triplets of the first 3 and the last 3
bases separately. For example, the score for the
ligation AAAAAC / TTTGCA is 0-3 and the score for
AAAAAC / TCAGGG is 2-2. All possible scores are ranked
from highest to lowest according to the order below:

highest:  3-3
          3-2/2-3
          2-2
          3-1/1-3
          2-1/1-2
          1-1
          3-0/0-3
          2-0/0-2
lowest:   1-0/0-1
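
A minimal Python sketch of the unweighted score is given here for illustration. It assumes that the two overhangs are compared position by position as written, and that a mismatch is any position that is not an A/T or C/G pair. The numeric ranks 9 to 1 follow the order listed above; giving a mismatch-free ligation the rank 0 is an assumption added here.

```python
WATSON_CRICK = {("A", "T"), ("T", "A"), ("C", "G"), ("G", "C")}

# Rank of each score, from highest (3-3) to lowest (1-0/0-1).
RANK = {(3, 3): 9, (3, 2): 8, (2, 2): 7, (3, 1): 6, (2, 1): 5,
        (1, 1): 4, (3, 0): 3, (2, 0): 2, (1, 0): 1}


def unweighted_score(top, bottom):
    """Count mismatches in the first and in the last three positions."""
    mismatches = [(a, b) not in WATSON_CRICK for a, b in zip(top, bottom)]
    return sum(mismatches[:3]), sum(mismatches[3:])


def rank(score):
    """Rank a score; 3-2 and 2-3 are treated alike. 0 means no mismatch."""
    return RANK.get(tuple(sorted(score, reverse=True)), 0)


print(unweighted_score("AAAAAC", "TTTGCA"))
print(unweighted_score("AAAAAC", "TCAGGG"))
```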
Weighted score 

The weighted score (WS) for a ligation is computed as
follows:

$$WS = 6 - \sum_{i=1}^{6} BPS_i$$

where BPS_i is the score for the particular base pair at
site i and is given in the table below:



AA = 1.0    CA = 0.6    GA = 1.0    TA = 0.0
AC = 0.6    CC = 1.0    GC = 0.0    TC = 0.6
AG = 1.0    CG = 0.0    GG = 0.9    TG = 0.2
AT = 0.0    CT = 0.6    GT = 0.2    TT = 0.6

For the perfect match between an overhang and its
complement, WS = 6.
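
The weighted score can be sketched as follows, using the base pair scores from the table above. The function name weighted_score and the dictionary layout are illustrative assumptions; the overhangs are compared position by position as written.

```python
# Base pair scores (BPS) keyed by the two bases facing each other.
BPS = {
    "AA": 1.0, "CA": 0.6, "GA": 1.0, "TA": 0.0,
    "AC": 0.6, "CC": 1.0, "GC": 0.0, "TC": 0.6,
    "AG": 1.0, "CG": 0.0, "GG": 0.9, "TG": 0.2,
    "AT": 0.0, "CT": 0.6, "GT": 0.2, "TT": 0.6,
}


def weighted_score(top, bottom):
    """WS = 6 minus the sum of the six base pair scores."""
    return 6 - sum(BPS[a + b] for a, b in zip(top, bottom))


# A perfect match scores 6; the self-ligation of AAAAAC scores 0.8.
print(weighted_score("AAAAAC", "TTTTTG"))
print(round(weighted_score("AAAAAC", "CAAAAA"), 1))
```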

COMPARISON AMONG PAIRS AND CONSTRUCTION OF TABLES OF 
SCORES 

Finding the next overhang pair 

To select the next overhang pair, tables of mismatch 
scores between the pairs selected at previous positions 
and all remaining pairs are computed. To construct such 
a table, all previously selected pairs are compared with 
the new pair and also every overhang is compared with 
itself. Thus, if n pairs have already been selected, the
number of ligations considered for each table is 4n +
2(n+1) = 6n+2. When comparing two overhangs that are on
the same DNA strand, one of them is reversed.

Let us consider the following example where pairs
AAAAAC/TTTTTG (1A/1B) and AAACGT/TTTGCA (2A/2B) have
been chosen previously and the new pair AGTCCC/TCAGGG
(3A/3B) is tried at the next position.

The corresponding table is:

Comparison  Overhangs  Ligation         Unweighted  Weighted
                                        score       score
1 vs 1      1A / 1A    AAAAAC / CAAAAA  3-3         0.8
            1B / 1B    TTTTTG / GTTTTT  3-3         3.2
2 vs 2      2A / 2A    AAACGT / TGCAAA  2-2         2.8
            2B / 2B    TTTGCA / ACGTTT  2-2         4.4
3 vs 3      3A / 3A    AGTCCC / CCCTGA  2-2         3.6
            3B / 3B    TCAGGG / GGGACT  2-2         3.6
1 vs 3      1A / 3A    AAAAAC / CCCTGA  3-2         2.6
            1A / 3B    AAAAAC / TCAGGG  2-2         2.4
            1B / 3A    TTTTTG / AGTCCC  2-2         4.0
            1B / 3B    TTTTTG / GGGACT  3-2         4.6
2 vs 3      2A / 3A    AAACGT / CCCTGA  3-2         2.7
            2A / 3B    AAACGT / TCAGGG  2-2         3.3
            2B / 3A    TTTGCA / AGTCCC  2-2         3.6
            2B / 3B    TTTGCA / GGGACT  3-2         3.4


Here, the lowest score is 2-2; 2.4 given by the ligation 
between overhangs 1A and 3B. 
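
The table for a candidate pair can be reproduced with a short Python sketch. The function names, the tuple layout of a row and the rounding of the weighted score to one decimal are assumptions made for illustration. Overhangs carrying the same letter (A with A, B with B) are taken to lie on the same strand, so the second one is reversed, as in the rows of the table.

```python
MATCHES = {"AT", "TA", "CG", "GC"}
BPS = {
    "AA": 1.0, "CA": 0.6, "GA": 1.0, "TA": 0.0,
    "AC": 0.6, "CC": 1.0, "GC": 0.0, "TC": 0.6,
    "AG": 1.0, "CG": 0.0, "GG": 0.9, "TG": 0.2,
    "AT": 0.0, "CT": 0.6, "GT": 0.2, "TT": 0.6,
}
RANK = {(3, 3): 9, (3, 2): 8, (2, 2): 7, (3, 1): 6, (2, 1): 5,
        (1, 1): 4, (3, 0): 3, (2, 0): 2, (1, 0): 1}


def score(top, bottom):
    """Return (unweighted score, weighted score) for one ligation."""
    flags = [a + b not in MATCHES for a, b in zip(top, bottom)]
    unweighted = (sum(flags[:3]), sum(flags[3:]))
    weighted = round(6 - sum(BPS[a + b] for a, b in zip(top, bottom)), 1)
    return unweighted, weighted


def next_pair_table(chosen, candidate):
    """Score every overhang against itself and each chosen pair against the candidate."""
    rows = []
    for a, b in chosen + [candidate]:
        rows += [(a, a[::-1]), (b, b[::-1])]
    ca, cb = candidate
    for a, b in chosen:
        rows += [(a, ca[::-1]), (a, cb), (b, ca), (b, cb[::-1])]
    return [(top, bottom, *score(top, bottom)) for top, bottom in rows]


def lowest(table):
    """Lowest unweighted rank first, then lowest weighted score."""
    def key(row):
        return RANK.get(tuple(sorted(row[2], reverse=True)), 0), row[3]
    return min(table, key=key)


chosen = [("AAAAAC", "TTTTTG"), ("AAACGT", "TTTGCA")]
table = next_pair_table(chosen, ("AGTCCC", "TCAGGG"))
worst = lowest(table)
print(len(table), worst)
```

With n = 2 chosen pairs the table has 6n+2 = 14 rows, and the lowest entry is the 1A/3B ligation.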

Score table for a chain 

To compute the table of mismatch scores for a chain, all 
overhang pairs contained in the chain are compared with 
each other and also every overhang is compared with 
itself. Thus, for a chain of p overhang pairs, the
number of ligations considered is 4p(p-1)/2 + 2p =
2p^2. As above, one of the two overhangs is reversed
in the comparison when both are on the same DNA strand.
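
The count of ligations and the lowest score of a chain can be checked with the following sketch, applied to the three pairs used in the worked example of this section. The function names and the rounding of the weighted score to one decimal are illustrative assumptions. Overhangs with the same letter (A with A, B with B) are treated as lying on the same strand, so the second is reversed.

```python
from itertools import combinations

MATCHES = {"AT", "TA", "CG", "GC"}
BPS = {
    "AA": 1.0, "CA": 0.6, "GA": 1.0, "TA": 0.0,
    "AC": 0.6, "CC": 1.0, "GC": 0.0, "TC": 0.6,
    "AG": 1.0, "CG": 0.0, "GG": 0.9, "TG": 0.2,
    "AT": 0.0, "CT": 0.6, "GT": 0.2, "TT": 0.6,
}
RANK = {(3, 3): 9, (3, 2): 8, (2, 2): 7, (3, 1): 6, (2, 1): 5,
        (1, 1): 4, (3, 0): 3, (2, 0): 2, (1, 0): 1}


def score(top, bottom):
    """Return (unweighted score, weighted score) for one ligation."""
    flags = [a + b not in MATCHES for a, b in zip(top, bottom)]
    unweighted = (sum(flags[:3]), sum(flags[3:]))
    weighted = round(6 - sum(BPS[a + b] for a, b in zip(top, bottom)), 1)
    return unweighted, weighted


def chain_table(chain):
    """Score all self-ligations and all cross-ligations within a chain."""
    rows = []
    for a, b in chain:
        rows += [(a, a[::-1]), (b, b[::-1])]
    for (xa, xb), (ya, yb) in combinations(chain, 2):
        rows += [(xa, ya[::-1]), (xa, yb), (xb, ya), (xb, yb[::-1])]
    return [(top, bottom, *score(top, bottom)) for top, bottom in rows]


chain = [("AAAAAC", "TTTTTG"), ("AAACGT", "TTTGCA"), ("AGTCCC", "TCAGGG")]
table = chain_table(chain)
worst = min(
    table,
    key=lambda r: (RANK.get(tuple(sorted(r[2], reverse=True)), 0), r[3]),
)
print(len(table), worst)
```

For p = 3 the table has 2p^2 = 18 rows, and the lowest entry is the 0-3 ligation between AAAAAC and TTTGCA.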

For example, let us consider the following 3-pair (i.e.,
4-position) chain: AAAAAC/TTTTTG (1A/1B), AAACGT/TTTGCA
(2A/2B), AGTCCC/TCAGGG (3A/3B), in which 1A is on one
fragment, 1B and 2A are on a second fragment, 2B and 3A
are on a third fragment and 3B is on a fourth fragment.



The corresponding table is:

Comparison  Overhangs  Ligation         Unweighted  Weighted
                                        score       score
1 vs 1      1A / 1A    AAAAAC / CAAAAA  3-3         0.8
            1B / 1B    TTTTTG / GTTTTT  3-3         3.2
2 vs 2      2A / 2A    AAACGT / TGCAAA  2-2         2.8
            2B / 2B    TTTGCA / ACGTTT  2-2         4.4
3 vs 3      3A / 3A    AGTCCC / CCCTGA  2-2         3.6
            3B / 3B    TCAGGG / GGGACT  2-2         3.6
1 vs 2      1A / 2A    AAAAAC / TGCAAA  2-3         1.8
            1A / 2B    AAAAAC / TTTGCA  0-3         3.8
            1B / 2A    TTTTTG / AAACGT  0-3         5.0
            1B / 2B    TTTTTG / ACGTTT  2-3         3.8
1 vs 3      1A / 3A    AAAAAC / CCCTGA  3-2         2.6
            1A / 3B    AAAAAC / TCAGGG  2-2         2.4
            1B / 3A    TTTTTG / AGTCCC  2-2         4.0
            1B / 3B    TTTTTG / GGGACT  3-2         4.6
2 vs 3      2A / 3A    AAACGT / CCCTGA  3-2         2.7
            2A / 3B    AAACGT / TCAGGG  2-2         3.3
            2B / 3A    TTTGCA / AGTCCC  2-2         3.6
            2B / 3B    TTTGCA / GGGACT  3-2         3.4


Here, the lowest score is 0-3; 3.8 given by the ligation
between overhangs 1A and 2B.

Results obtained:

Table of breaking points

PART 1

# of        Unweighted  Weighted  # of equal
positions   score       score     chains
3           3-3         1.6       48
4           2-2         4.0       48
9           2-2         2.5       12
10          3-1         3.2       12
14          3-1         2.4       6
15          2-1         4.6       6
33          2-1         3.0       12
34          3-0         4.6       12
90          3-0         3.1

PART 2

# of        Unweighted  Weighted  # of equal
positions   score       score     chains
3           3-3         1.6       48
4           3-2         2.2       48
9           2-2         2.5       12
10          3-1         3.2       12
14          3-1         2.4       6
15          3-1         2.0       6
33          2-1         3.0       12
34          3-0         4.6       12
90
It will be noted that the unweighted mis-match score (in
which 9 = 3-3, 8 = 3-2, 7 = 2-2, 6 = 3-1, 5 = 2-1, 4 =
1-1, 3 = 3-0, 2 = 2-0, 1 = 1-0) reduces as the number of
positions increases.

Samples of chains obtained at the end of part 1 and at
the end of part 2

3 positions (this chain is obtained at the end of both
parts):

AACTCG/TTGAGC
TCTCAC/AGAGTG

4 positions:
part 1

AATTGG/TTAACC
TGCCAC/ACGGTG
ATAGTC/TATCAG

part 2

AATGGG/TTACCC
TCGGAC/AGCCTG
TTAACG/AATTGC

9 positions (this chain is obtained at the end of both
parts):

AATCAC/TTAGTG  TACACG/ATGTGC  AGGCTG/TCCGAC
TGAGGG/ACTCCC  ACATTC/TGTAAG  TTTAGC/AAATCG
TCGGAT/AGCCTA  GGCTAG/CCGATC

10 positions (this chain is obtained at the end of both
parts):

AAAACC/TTTTGG  AGGCTC/TCCGAG  TCGATA/AGCTAT
TTGGGG/AACCCC  GTCATG/CAGTAC  ATTCAG/TAAGTC
TCATAG/AGTATC  TGCAGT/ACGTCA  AGAGAT/TCTCTA

14 positions (this chain is obtained at the end of both
parts):

ACGTGC/TGCACG  GTTGGC/CAACCG  TCAGCC/AGTCGG
TATGAG/ATACTC  TTGCGG/AACGCC  AGAGGG/TCTCCC
TGCACG/ACGTGC  AGTATC/TCATAG  CACCGC/GTGGCG
ATACAC/TATGTG  TGACTA/ACTGAT
AACTTG/TTGAAC  ACTCCG/TGAGGC

15 positions:
part 1

AAAACC/TTTTGG  TGCAGT/ACGTCA  AAGTAA/TTCATT
TTGGGG/AACCCC  TCGATA/AGCTAT  CCGTCC/GGCAGG
TCATAG/AGTATC  ATTCAG/TAAGTC  TGTAAC/ACATTG
AGGCTC/TCCGAG  AGAGAT/TCTCTA  ACCGTG/TGGCAC
GTCATG/CAGTAC  TACTTC/ATGAAG

part 2

AAAACC/TTTTGG  TCTGCT/AGACGA  AAGTAA/TTCATT
TTGGGG/AACCCC  TCGATA/AGCTAT  CCGTCC/GGCAGG
TCATAG/AGTATC  ATTCAG/TAAGTC  TGTAAC/ACATTG
AGGCTC/TCCGAG  AGAGAT/TCTCTA  ACCGTG/TGGCAC
GACAAG/CTGTTC  TACTTC/ATGAAG

33 positions (this chain is obtained at the end of both
parts):

AACTAG/TTGATC  GTAAGG/CATTCC  TCGCCT/AGCGGA
TGGAGC/ACCTCG  AAACTA/TTTGAT  TCTCGG/AGAGCC
TCAAAT/AGTTTA  GTCTCC/CAGAGG  ACCCCC/TGGGGG
CAGGCC/GTCCGG  ACAGCG/TGTCGC  TTTTCG/AAAAGC
TATCAC/ATAGTG  CACATC/GTGTAG  AAGTCA/TTCAGT
AGATTC/TCTAAG  TGTGTA/ACACAT  GTTCTC/CAAGAG
TTCCGT/AAGGCA  TAATGC/ATTACG
CCCACG/GGGTGC  GGTAAG/CCATTC
ATGCCG/TACGGC  AGTTAT/TCAATA
TCCGTC/AGGCAG  CAACAG/GTTGTC
CCACGC/GGTGCG  ATCGGC/TAGCCG
ACTATG/TGATAC  AATGCT/TTACGA
TTAGCA/AATCGT  TTGGAG/AACCTC

34 positions (this chain is obtained at the end of both
parts):

AACTCT/TTGAGA  TTATTC/AATAAG  CCAATC/GGTTAG
TCGAAC/AGCTTG  CACAAG/GTGTTC  ACTTAT/TGAATA
CAGGGC/GTCCCG  TCCGAT/AGGCTA  AAAGAG/TTTCTC
TAAAGG/ATTTCC  AGTAGC/TCATCG  TTGATA/AACTAT
TGTGCG/ACACGC  CCGTCG/GGCAGC  AAGACC/TTCTGG
ATGTAG/TACATC  TCACTA/AGTGAT  CAATCC/GTTAGG
TTCCCC/AAGGGG  GTGACG/CACTGC  TCTCGC/AGAGCG
AATCTC/TTAGAG  TGAAAT/ACTTTA  AGGGGG/TCCCCC
TGGCGT/ACCGCA  AGCATG/TCGTAC  TGCCAG/ACGGTC
GGCTGC/CCGACG  ACCGTC/TGGCAG  TACTAC/ATGATG
TTTGAC/AAACTG
ACACCG/TGTGGC
TGAGGC/ACTCCG



90 positions (this chain is obtained at the end of part
1):

AAAAAA/TTTTTT  TCTGGC/AGACCG  AAACGG/TTTGCC
CCGGCC/GGCCGG  ACGCAG/TGCGTC  TTTGCC/AAACGG
AGGTAG/TCCATC  TGCGTC/ACGCAG  AACCAA/TTGGTT
TCCATC/AGGTAG  AGTCAT/TCAGTA  CAAAAC/GTTTTG
ATCTGC/TAGACG  TCAGTA/AGTCAT  AAGGAA/TTCCTT
TAGACG/ATCTGC  CAGCCG/GTCGGC  CGCCGC/GCGGCG
ACTGTG/TGACAC  GTCGGC/CAGCCG  AGTGCG/TCACGC
TGACAC/ACTGTG  AATTTC/TTAAAG  TCACGC/AGTGCG
CATTAC/GTAATG  TTAAAG/AATTTC  ATTTTA/TAAAAT
ACCCCA/TGGGGT  CCAACG/GGTTGC  ATCCTA/TAGGAT
ATGGTA/TACCAT  GGTTGC/CCAACG  AGTATC/TCATAG
CGAAGC/GCTTCG  CACCAC/GTGGTG  TCATAG/AGTATC
ATTACC/TAATGG  AGAATA/TCTTAT  ATGTGG/TACACC
TAATGG/ATTACC  TCTTAT/AGAATA  TACACC/ATGTGG
CTCCTC/GAGGAG  ATCAAT/TAGTTA  ATGCAC/TACGTG
AGTTGA/TCAACT  TAGTTA/ATCAAT  TACGTG/ATGCAC
AATGCT/TTACGA  ACTTCA/TGAAGT  ACTAAC/TGATTG
TTACGA/AATGCT  AGCCCC/TCGGGG  TGATTG/ACTAAC
AAGCGC/TTCGCG  TCGGGG/AGCCCC  CAGTGC/GTCACG
TTCGCG/AAGCGC  ACCATG/TGGTAC  GTCACG/CAGTGC
CCCAAG/GGGTTC  TGGTAC/ACCATG  AATAAG/TTATTC
GGGTTC/CCCAAG  AGGGGA/TCCCCT  TTATTC/AATAAG
ACATCC/TGTAGG  CTAATC/GATTAG  AGATAT/TCTATA
TGTAGG/ACATCC  CGAGAG/GCTCTC  TCTATA/AGATAT
AACTTG/TTGAAC  GCTCTC/CGAGAG  AAGTCG/TTCAGC
TTGAAC/AACTTG  ACACGT/TGTGCA  TTCAGC/AAGTCG
ATAGAC/TATCTG  TGTGCA/ACACGT  AATCGA/TTAGCT
TATCTG/ATAGAC  CCTGTC/GGACAG  TTAGCT/AATCGA
AGACCG/TCTGGC  GGACAG/CCTGTC  AGGCTC/TCCGAG
TCCGAG/AGGCTC
CGGGGC/GCCCCG

EXAMPLE 7 - CONSTRUCTION OF A 5-FRAGMENT CHAIN ENCODING
THE BINARY SEQUENCE 1-0-1-0-0

This experiment demonstrates the construction of a
specific 5 fragment chain using a set of four
non-palindromic 5' 6 base overhang pairs. The set of
four unique overhang pairs was found using a computer
program as described in Example 6.

Based upon the overhang pairs, a set of five library
components was made by annealing complementary
oligonucleotides in separate tubes:

signal 1:
5'-TAATACGACTCACTATACCACAAGTTTGTACAAAAAAGCAGGCTCTATTC-3'
and 5'-TAGGAAGAATAGAGCCTGCTTTTTTGTACAAACTTGTGGTATAGTGA
GTCGTATTA-3';
signal 2:
5'-TTCCTATGCAGTGGACCACTTTGTACAAGAAAGCTGGGTTGCAGT-3' and
5'-GCAACTACTGCAACCCAGCTTTCTTGTACAAAGTGGTCCACTGCA-3';
signal 3:
5'-AGTTGCTTGACGCCACAAGTTTGTACAAAAAAGCAGGCTTTGACG-3' and
5'-CGACATCGTCAAAGCCTGCTTTTTTGTACAAACTTGTGGCGTCAA-3';
signal 4:
5'-ATGTCGAAGGGCGGACCACTTTGTACAAGAAAGCTGGGTAAGGGC-3' and
5'-GACAGGGCCCTTACCCAGCTTTCTTGTACAAAGTGGTCCGCCCTT-3';
signal 5:
5'-CCTGTCATGTGGACCACTTTGTACAAGAAAGCTGGGTTTCTATAGTGTCACCT
AAATC-3' and 5'-GATTTAGGTGACACTATAGAAACCCAGCTTTCTTGTACAA
AGTGGTCCACAT-3';

T7: 5'-TAATACGACTCACTATACCA-3'
T7-Cy5 primer: 5'-TAATACGACTCACTATA-3'
SP6 primer: 3'-AAGATATCACAGTGGATTTAG-5'

The library components (4 pmol each) were then mixed
together and ligated using 100 U T4 DNA ligase (NEB) in
1X ligase buffer at 25°C for 15 minutes. The ligase was
then inactivated at 65°C for 20 min.

5 µl of the ligation reaction (50 µl) was used as template
in a PCR reaction (50 µl) containing 1X Thermopol buffer
(NEB), 0.05 mM dNTPs, 0.4 µM T7 primer, 0.4 µM SP6
primer and 0.04 U/µl Vent polymerase (NEB). The PCR was
hot started (95°C for 3 minutes before addition of
polymerase) and cycled 30 times (95°C, 30 sec; 55°C, 30
sec; 76°C, 30 sec) using a PTC-200 thermocycler (MJ
Research). 10 µl of the PCR was analysed on a 1.5%
agarose gel as shown in Figure 5. The gel picture showed
only one intense band corresponding to approximately 240
bp, as expected (243 bp). The remaining PCR product was
extracted twice with chloroform and precipitated using
71% ethanol and 0.1 M NaAc. The DNA was dissolved in
water and sequenced. The sequence confirmed that the
expected signal chain (1-0-1-0-0) was generated.

EXAMPLE 8 - CONSTRUCTION OF A 5X5 FRAGMENT CHAIN
ENCODING THE BINARY SEQUENCE USING ONE LIGATION CYCLE
FOLLOWED BY ONE PCR CYCLE OR BY TWO LIGATION CYCLES

This experiment demonstrates the use of complementary
primer pairs to link fragment chains together as an
alternative to the ligation strategy demonstrated in the
previous example.

In this experiment 5 fragment chains with 5 positions
(fragments or bits) each are ligated separately in
ligation cycle 1 as demonstrated earlier (Example 7).
The 5 fragment chains are then amplified with 5
different primer pairs (pair 1 is used to amplify chain
1, pair 2 is used to amplify chain 2, etc.). The second
primer in primer pair 1 is complementary to the first
primer in primer pair 2, the second primer in primer pair
2 is complementary to the first primer in primer pair 3,
and so on.

A small aliquot is then taken from each of the 5 PCR
reactions and a new PCR reaction is performed with
primers that are specific to the ends of signal chains 1
and 5. The method is illustrated in Figure 6.

Materials:

Oligonucleotides are selected which bind to the fragment
chain and also serve as primers. Thus, for example,
adjacent chains may be bound using the following primer
pairs:

fragment chain 2 terminal (with bound primer) : 
TTCTATAGTGTCACCTAAATC 

AAGATATCACAGTGGATTTAGCCTACCAGTACATCCAACGGCAACT 

fragment chain 3 terminal (with bound primer) : 
GTCATGTAGGTTGCCGTTGATCCATCCTAATACGACTCACTATAGCA 

ATTATGCTGAGTGATATCGT 

The above exemplified primer regions are complementary 
and may thus be bound together. 

As an alternative to this method, two ligation cycles
may be used in which 5 fragment chains (generated by
ligation) are ligated together. Thus, several
construction cycles may be used to build up long signal
chains. After the initial ligation in the first ligation
cycle the 5 fragment chains are amplified with primers
containing a FokI site. The primers are selected such
that digestion with FokI then creates
non-palindromic overhangs at the ends of each fragment
chain, in which the overhang generated in fragment chain
1 is able to ligate with the first overhang generated in
fragment chain 2, the second overhang generated in
fragment chain 2 is able to ligate with the first
overhang generated in fragment chain 3, and so on. The 5
fragment chains can thereby be ligated together in a
controlled manner to generate a final chain with 25
fragments (bits).
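
The requirement that the overhangs be non-palindromic and specific to one junction can be expressed as a simple check. This is a minimal sketch, not part of the described method: it assumes four-base overhangs (as FokI digestion typically leaves), and the overhang sequences used are hypothetical examples, not the ones used in the experiment.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def is_palindromic(overhang: str) -> bool:
    """A palindromic overhang equals its own reverse complement and can self-ligate."""
    return overhang == revcomp(overhang)


def usable(junction_overhangs: list[str]) -> bool:
    """True if every junction overhang is non-palindromic and cannot pair
    with the overhang of any other junction in the list."""
    seen: set[str] = set()
    for overhang in junction_overhangs:
        if is_palindromic(overhang):
            return False
        if overhang in seen or revcomp(overhang) in seen:
            return False
        seen.add(overhang)
    return True


# Hypothetical overhangs for the 4 junctions between 5 fragment chains.
junctions = ["ACTG", "GATT", "CCAT", "AAGC"]
print(usable(junctions))
```

A set that contains a palindromic overhang such as TGCA, or two overhangs that are reverse complements of each other, is rejected, because such ends could ligate in an uncontrolled order.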

If we want to construct fragment chains with 100 or 500 
fragments we can repeat this procedure 1 or 2 more 
times. The polymerase capacity will, however, be a 
limiting factor regarding how many ligation cycles it is 
possible to perform. Other strategies will therefore 
need to be employed to construct even longer chains. 
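
The arithmetic behind these numbers can be sketched as follows, assuming that every construction cycle joins 5 units (5 fragments in cycle 1, then 5 chains in each later cycle). The function name and the default of 5 units per cycle are illustrative assumptions.

```python
def cycles_needed(target_fragments: int, per_cycle: int = 5) -> tuple[int, int]:
    """Return (number of cycles, resulting chain length) needed to reach
    at least target_fragments when each cycle joins per_cycle units."""
    cycles, length = 0, 1
    while length < target_fragments:
        length *= per_cycle
        cycles += 1
    return cycles, length


for target in (25, 100, 500):
    print(target, cycles_needed(target))
```

With 5 units per cycle, two cycles give 25 fragments, a third gives 125 (covering the 100-fragment case) and a fourth gives 625 (covering the 500-fragment case), which matches repeating the procedure 1 or 2 more times.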

EXAMPLE 9: CLONING OF AN INSERT FROM PHIX174 INTO PUC19
WITH A TRIMMED GENE A

This experiment demonstrates the "trimming" strategy for
elimination of unwanted flanking sequences. Another
important aspect of this experiment is that we
demonstrate that it is possible to link a 5' and a 3'
overhang together with a single stranded oligonucleotide
alone. It should also be noted that the inserts are
cloned into two different IIS sites, thereby eliminating
the problem with insert concatemerisation.

In this method, Gene A from PhiX174 is cloned into a
pUC-19 vector. PhiX174 is prepared by cleavage with
BbvI, resulting in 15 fragments flanked by different
non-palindromic 5' 4-base overhangs, as described in
more detail in Example 1. The two overhangs adjacent to
Gene A are then addressed with "initiation linkers"
containing a BplI site, while the rest of the fragments
are allowed to religate. T4 DNA ligase, BplI, a
"propagation linker" containing a BplI site, and two
"termination adaptors" addressed to the first and last
five bases of Gene A respectively are used. The
solution is incubated at 37°C, thereby allowing the
trimming reaction to proceed until it is terminated when
the first and last five bases of Gene A are reached.

The pUC-19 vector is prepared by cleavage with HgaI and
BsaI. The overhangs generated by HgaI cleavage are
described in Example 1. Cleavage with BsaI results in 4
non-identical cleavages giving rise to 8 non-identical
overhangs, e.g. site 1 - GCCA/CGGT (1600).

Gene A has the following sequence at its first and last 
five bases (marked by underlining) . 

. . . GCTGGAGGCCTCCACTATGAAATCGCGTAGAG . . . 
. . . CGACCTCCGGAGGTGATACTTTAGCGCATC 

CTGGCGGAAAATGAGAAAATTCGACCTA . . . 

. . . ACGACCGCCTTTTACTCTTTTAAGCTGG 

When terminating the trimming procedure at the
underlined sequences it is possible to clone Gene A
without any unwanted flanking base pairs. The 3' 5-base
overhangs generated by BplI correspond to the marked
base pairs.

The overhang pair generated by HgaI and BsaI in pUC19
that is used as a cloning site for Gene A from
PhiX174 is TTCTC/CGGT.

Method: 

This is as described in Example 1 except that pUC19 is
cut first with HgaI (NEB 4, 37°C) and thereafter with
BsaI (NEB 4, 50°C).

Materials:

Initiation linker 1 (s):
5' ATT CGG TCG AGA TGC TCT CA 3'

Initiation linker 1 (as):
5' CGA CTG AGA GCA TCT CGA CCG AAT 3'

Initiation linker 2 (s):
5' GCG TTA CTG AGC GTA GCT CTG 3'

Initiation linker 2 (as):
5' CTC TCA GAG CTA CGC TCA GTA ACG C 3'

Propagation linker (s):
5' TGC TGC AGG AGC GAA TCT CNN NNN 3'

Propagation linker (as):
5' GAG ATT CGC TCC TGC AGC A 3'

Labeling linker 2 (s):
5' CTC TTG CTA TAG TGA GTC GTA TTA 3'

Labeling linker 2 (as):
5' TAA TAC GAC TCA CTA TAG CA 3'

Termination linker 1 (s):
5' AAG AGC TCA GGT CAT TGA CGT AGC TAT GAA 3'

Termination linker 1/2 (as):
5' AGC TAC GTC AAT GAC CTG AG 3'

Termination linker 1 (short version):
5' AAG AGA TGA A 3'

Termination linker 2 (s):
5' ACC GCT CAG GTC ATT GAC GTA GCT TCA TT 3'

Termination linker 2 (short version):
5' ACC GTC ATT 3'

The efficiency of the trimming reaction may be assessed
as follows. Overhang 6) is addressed with a γ-32P
labelled adaptor. The trimming reaction is then allowed
to start from overhang 1). Aliquots are taken out at
regular time intervals and the size distribution of
the DNA fragments is then analysed on a gel.

Claims:

1. A method of attaching a fragment of a first nucleic
acid molecule to a second nucleic acid molecule, wherein
said method comprises at least the steps:

1) cleaving said first nucleic acid molecule with a
nuclease which has a cleavage site separate from its
recognition site to create at least one fragment of said
first nucleic acid molecule having a single stranded
nucleotide region (SS1a) at at least one terminus of
said fragment,

2) if necessary generating a single stranded
nucleotide region (SS2) at at least one terminus of said
second nucleic acid molecule,

3) binding to at least one single stranded region of
step 1) (SS1a) an adapter molecule comprising at one
terminus a single stranded region (SSA1) complementary
to the single stranded region of said first nucleic acid
molecule fragment (SS1a) and additionally comprising at
the other terminus a further single stranded region
(SSA2) complementary to the single stranded region (SS2)
at one terminus of said second nucleic acid molecule,

4) ligating said adapter to said first nucleic acid
fragment,

5) binding said adapter to said second nucleic acid
molecule, and

6) ligating said adapter to said second nucleic acid
molecule.

2. A method as claimed in claim 1 wherein said first
nucleic acid molecule fragment has a single stranded
nucleotide region at either terminus (SS1a and SS1b),
each of which is bound by an adapter, which may be the
same or different, and the first of said adapters is
bound to said second nucleic acid molecule and the
second of said adapters binds either to said second
nucleic acid molecule or to a third nucleic acid
molecule.

3. A method as claimed in claim 2, wherein said
adapters bind to the termini of said second nucleic acid
molecule, thereby forming a circular nucleic acid
molecule.

4. A method as claimed in any one of claims 1 to 3,
wherein said second nucleic acid molecule is a vector or
a fragment thereof and single stranded regions are
produced in step 2) by cleavage of said vector with a
nuclease.

5. A method as claimed in any one of claims 1 to 4,
wherein said adapter molecule additionally comprises one
or more nuclease recognition and cleavage sites.

6. A method as claimed in any one of claims 1 to 5,
wherein said nuclease is a restriction enzyme from the
class of IP or IIS enzymes.

7. A method as claimed in any one of claims 1 to 6,
wherein two or more fragments of the first nucleic acid
molecule are attached to different second and optionally
third nucleic acid molecules, or different termini
thereof.

8. A method as claimed in any one of claims 4 to 7,
wherein one or more fragments of said first nucleic acid
molecule are attached via adapters to single stranded
regions in said second nucleic acid molecule resulting
from different cleavage events.

9. A method as claimed in claim 7 or 8, wherein one or
more fragments of said first nucleic acid molecule are
attached via adapters to single stranded regions in two
or more second nucleic acid molecules.

10. A method as claimed in any one of claims 1 to 9,
wherein 2 or more first nucleic acid molecules are
cleaved and bound to one or more second nucleic acid
molecules by adapter molecules simultaneously in the
same reaction.

11. A method as claimed in any one of claims 1 to 10,
wherein all the steps are conducted together.

12. A nucleic acid molecule produced according to a
method as defined in any one of claims 1 to 11.

13. A cloning or expression vector containing the
nucleic acid molecule as defined in claim 12.

14. A eukaryotic or prokaryotic cell or transgenic
organism containing a vector as defined in claim 13.

15. A kit for attaching a first nucleic acid molecule 
fragment to a second nucleic acid molecule or a fragment 
thereof according to the method defined in any one of 
claims 1 to 11 comprising at least (i) one or more 
adapters as described in any one of claims 1 to 9, (ii) 
the second nucleic acid molecule and (iii) a nuclease 
which cleaves outside its recognition site, wherein the 
terminus of one of said adapters has a single stranded 
region complementary to a single stranded region 
generated on said second nucleic acid molecule after 
cleavage with said nuclease. 

16. A method of synthesizing a double stranded nucleic
acid molecule comprising at least the steps of:

1) generating n double stranded nucleic acid
fragments, wherein at least n-2 fragments have single
stranded regions at both termini and 2 fragments have
single stranded regions at at least one terminus,
wherein (n-1) single stranded regions are complementary
to (n-1) other single stranded regions, thereby
producing (n-1) complementary pairs,

2) contacting said n double stranded nucleic acid
fragments, simultaneously or consecutively, to effect
binding of said complementary pairs of single stranded
regions, and

3) optionally ligating said complementary pairs
simultaneously or consecutively to produce a nucleic
acid molecule consisting of n fragments.

17. A method as claimed in claim 16 wherein said 
fragments are each between 8 and 25 bases in length. 

18. A method as claimed in claim 16 or 17 wherein n is
at least 10.

19. A method as claimed in any one of claims 16 to 18
wherein said fragment comprises a region representing a
unit of information corresponding to one or more code
elements.

20. A method as claimed in claim 19 wherein said code
is alphanumeric.

21. A method as claimed in claim 20 wherein said code
is binary.

22. A method as claimed in any one of claims 19 to 21
wherein each of said one or more code elements has the
formula

(X)a,
wherein

X is a nucleotide A, T, G, C or a derivative
thereof which allows complementary binding and may be
the same or different at each position, and
a is an integer from 4 to 10,
wherein (X)a is different for each one or more code
elements.

23. A method as claimed in claim 22, wherein said code
is binary and the code elements "1" and "0" have the
formulae:

"0" = (X)a and "1" = (Y)b,
wherein

(X)a and (Y)b are not identical,
X and Y are each a nucleotide A, T, G, C or a
derivative thereof which allows complementary binding
and may be the same or different at each position, and
a and b are integers from 4 to 10.

24. A method as claimed in claim 23 wherein in the
formulae (X)a and (Y)b, X and Y are the same at each
position.

25. A method of synthesizing a double stranded nucleic
acid molecule comprising at least the steps of:

1) generating fragment chains according to the method
defined in any one of claims 16 to 24;

2) optionally generating single stranded regions at
the end of said fragment chains, wherein said single
stranded regions are complementary to other single
stranded regions on said fragment chains thus forming
complementary pairs of single stranded regions;

3) contacting said fragment chains with one another,
simultaneously or consecutively, to effect binding of
said complementary pairs of single stranded regions.

26. A nucleic acid molecule produced according to a
method as defined in any one of claims 16 to 25, or a
single stranded nucleic acid molecule thereof.


27. A method of identifying the code elements contained
in a nucleic acid molecule prepared according to a
method as defined in any one of claims 16 to 25, wherein
a probe, carrying a signalling means, specific to one or
more code elements, is bound to said nucleic acid
molecule and a signal generated by said signalling means
is detected, whereby said one or more code elements may
be identified.

28. A library of fragments as defined in any one of
claims 16 to 27, comprising (n) m fragments, wherein n is
as defined in any one of claims 16 to 27 and corresponds
to the length of chain that said library may produce,
and m is an integer corresponding to the number of
possible code elements or combinations thereof, such
that fragments corresponding to all possible code
elements for each position in the final chain are
provided.

Re Item I

Basis of the opinion

Sequence listing pages 1-23 filed with the letter of 5.9.2000 do not form part of the
application (Rule 13ter.1(f) PCT).

Re Item V 

Reasoned statement under Rule 66.2(a)(ii) with regard to novelty, inventive step or 
industrial applicability; citations and explanations supporting such statement 

D1: VERMERSCH P S ET AL: 'THE USE OF A SELECTABLE FOK-I CASSETTE IN DNA
REPLACEMENT MUTAGENESIS OF THE R388 DIHYDROFOLATE REDUCTASE GENE'
GENE (AMSTERDAM), vol. 54, no. 2-3, 1987, pages 229-238, XP002149816 ISSN: 0378-1119

D2: BRAKE A J ET AL: 'ALPHA-FACTOR-DIRECTED SYNTHESIS AND SECRETION OF
MATURE FOREIGN PROTEINS IN SACCHAROMYCES-CEREVISIAE' PROCEEDINGS OF
THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES, vol. 81, no. 15, 1984,
pages 4642-4646, XP002149815 1984 ISSN: 0027-8424

Arts. 33(2)(3) PCT, Novelty and Inventive Step 

D1 and D2 disclose methods of synthesizing double stranded nucleic acid molecules 
using fragments which contain natural, genetic code elements. 

Said documents do not, however, disclose methods of synthesizing a double
stranded nucleic acid molecule using fragments which comprise units of information
corresponding to code elements, wherein said code is alphanumeric or binary, or has a formula
as described in claim 3. Nor does the cited prior art contain any indication that would
prompt the skilled person to arrive at such methods. Such methods, molecules produced
by such methods, and libraries containing said fragments, as described in claims 1-14, are,
therefore, new and inventive over the cited prior art.

Re Item VI

Certain documents cited, Certain published documents (Rule 70.10)

Application No    Publication date    Filing date         Priority date (valid claim)
Patent No         (day/month/year)    (day/month/year)    (day/month/year)

WO 00/39333       6.7.2000            23.12.1999          13.12.1998

The above listed document was published and filed after the priority date of the
present application. It does, therefore, not belong to the state of the art according to Rule
64(1)(b) PCT. However, said document claims priority dates earlier than that of the present
application (28.6.1999). If this priority is valid, the document will become of relevance for
the novelty of the subject matter of the present application during regional phase
examination at the EPO.

Re Item VIII 

Certain observations on the international application

A "binary code" is a special type of an "alphanumeric code" (see the description of
the specification, p. 32, ln. 31-36; and original claim 21: "...as claimed in claim 20 wherein
said code is binary"). Hence, amended claim 2 contains all of the features of claim 1 and
represents, therefore, a dependent claim. Claim 2 is, however, formulated as an
independent claim and unduly repeats the wording of claim 1. Claim 2 is, therefore, neither
concise nor clear (Art. 6 PCT).

Claims:

1. A method of synthesizing a double stranded nucleic
acid molecule comprising at least the steps of:

1) generating n double stranded nucleic acid fragments,
wherein at least n-2 fragments have single stranded
regions at both termini and 2 fragments have single
stranded regions at at least one terminus, wherein (n-1)
single stranded regions are complementary to (n-1) other
single stranded regions, thereby producing (n-1)
complementary pairs,

2) contacting said n double stranded nucleic acid
fragments, simultaneously or consecutively, to effect
binding of said complementary pairs of single stranded
regions, and

3) optionally ligating said complementary pairs
simultaneously or consecutively to produce a nucleic
acid molecule consisting of n fragments,

wherein said fragment comprises a region representing a
unit of information corresponding to one or more code
elements and said code is alphanumeric.

2. A method of synthesizing a double stranded nucleic
acid molecule comprising at least the steps of:

1) generating n double stranded nucleic acid fragments,
wherein at least n-2 fragments have single stranded
regions at both termini and 2 fragments have single
stranded regions at at least one terminus, wherein (n-1)
single stranded regions are complementary to (n-1) other
single stranded regions, thereby producing (n-1)
complementary pairs,

2) contacting said n double stranded nucleic acid
fragments, simultaneously or consecutively, to effect
binding of said complementary pairs of single stranded
regions, and

3) optionally ligating said complementary pairs
simultaneously or consecutively to produce a nucleic
acid molecule consisting of n fragments,

wherein said fragment comprises a region representing a
unit of information corresponding to one or more code
elements and said code is binary.

3. A method of synthesizing a double stranded nucleic
acid molecule comprising at least the steps of:

1) generating n double stranded nucleic acid fragments,
wherein at least n-2 fragments have single stranded
regions at both termini and 2 fragments have single
stranded regions at at least one terminus, wherein (n-1)
single stranded regions are complementary to (n-1) other
single stranded regions, thereby producing (n-1)
complementary pairs,

2) contacting said n double stranded nucleic acid
fragments, simultaneously or consecutively, to effect
binding of said complementary pairs of single stranded
regions, and

3) optionally ligating said complementary pairs
simultaneously or consecutively to produce a nucleic
acid molecule consisting of n fragments,

wherein said fragment comprises a region representing a
unit of information corresponding to one or more code
elements and each of said one or more code elements has
the formula

(X)a,
wherein

X is a nucleotide A, T, G, C or a derivative
thereof which allows complementary binding and may be
the same or different at each position, and
a is an integer from 4 to 10,
wherein (X)a is different for each one or more code
elements.

4. A method as claimed in claim 3 wherein said code is
alphanumeric.

5. A method as claimed in claim 3 wherein said code is
binary.

6. A method as claimed in claim 5, wherein said code
is binary and the code elements "1" and "0" have the
formulae:

"0" = (X)a and "1" = (Y)b,
wherein

(X)a and (Y)b are not identical,
X and Y are each a nucleotide A, T, G, C or a
derivative thereof which allows complementary binding
and may be the same or different at each position, and
a and b are integers from 4 to 10.

7. A method as claimed in claim 6 wherein in the
formulae (X)a and (Y)b, X and Y are the same at each
position.

8. A method as claimed in any one of claims 1 to 7
wherein said fragments are each between 8 and 25 bases
in length.

9. A method as claimed in any one of claims 1 to 8
wherein n is at least 10.

10. A method of synthesizing a double stranded nucleic
acid molecule comprising at least the steps of:

1) generating fragment chains according to the method
defined in any one of claims 1 to 9;

2) optionally generating single stranded regions at
the end of said fragment chains, wherein said single
stranded regions are complementary to the single
stranded regions on said fragment chains thus forming
complementary pairs of single stranded regions;

3) contacting said fragment chains with one another,
simultaneously or consecutively, to effect binding of
said complementary pairs of single stranded regions.

11. A nucleic acid molecule produced according to a
method as defined in any one of claims 1 to 10, or a
single stranded nucleic acid molecule thereof.

12. A method of identifying the code elements contained
in a nucleic acid molecule prepared according to a
method as defined in any one of claims 1 to 10, wherein
a probe, carrying a signalling means, specific to one or
more code elements, is bound to said nucleic acid
molecule and a signal generated by said signalling means
is detected, whereby said one or more code elements may
be identified.

13. A library of fragments as defined in any one of
claims 1 to 12, comprising (n) m fragments, wherein n is
as defined in any one of claims 1 to 12 and corresponds
to the length of chain that said library may produce,
and m is an integer corresponding to the number of
possible code elements or combinations thereof, such
that fragments corresponding to all possible code
elements for each position in the final chain are
provided.

14. A kit for synthesizing a double stranded nucleic
acid molecule comprising a library as defined in claim
13 and a ligase.

EXAMINATION REPORT

International application No. PCT/GB00/02512

I. Basis of the report

1. With regard to the elements of the international application (Replacement sheets which have been furnished to
the receiving Office in response to an invitation under Article 14 are referred to in this report as "originally filed"
and are not annexed to this report since they do not contain amendments (Rules 70.16 and 70.17)):

Description, pages:
1-77     as originally filed

Claims, No.:
1-14     as received on 12/10/2001 with letter of 12/10/2001

Drawings, sheets:
1-6      as originally filed

Sequence listing part of the description, pages:
1-23, filed with the letter of 5.9.2000

2. With regard to the language, all the elements marked above were available or furnished to this Authority in the
language in which the international application was filed, unless otherwise indicated under this item.

These elements were available or furnished to this Authority in the following language: , which is:

□ the language of a translation furnished for the purposes of the international search (under Rule 23.1(b)).

□ the language of publication of the international application (under Rule 48.3(b)).

□ the language of a translation furnished for the purposes of international preliminary examination (under Rule
55.2 and/or 55.3).

3. With regard to any nucleotide and/or amino acid sequence disclosed in the international application, the
international preliminary examination was carried out on the basis of the sequence listing:

□ contained in the international application in written form.

□ filed together with the international application in computer readable form.

☒ furnished subsequently to this Authority in written form.

☒ furnished subsequently to this Authority in computer readable form.

☒ The statement that the subsequently furnished written sequence listing does not go beyond the disclosure in
the international application as filed has been furnished.

☒ The statement that the information recorded in computer readable form is identical to the written sequence
listing has been furnished.

4. The amendments have resulted in the cancellation of:

□ the description, pages:

□ the claims, Nos.:

□ the drawings, sheets:

5. □ This report has been established as if (some of) the amendments had not been made, since they have been
considered to go beyond the disclosure as filed (Rule 70.2(c)):

(Any replacement sheet containing such amendments must be referred to under item 1 and annexed to this
report.)

6. Additional observations, if necessary:



V. Reasoned statement under Article 35(2) with regard to novelty, inventive step or industrial applicability;
citations and explanations supporting such statement

1. Statement

Novelty (N)                      Yes: Claims 1-14
                                 No: Claims

Inventive step (IS)              Yes: Claims 1-14
                                 No: Claims

Industrial applicability (IA)    Yes: Claims 1-14
                                 No: Claims

2. Citations and explanations
see separate sheet

VI. Certain documents cited

1. Certain published documents (Rule 70.10)

and / or

2. Non-written disclosures (Rule 70.9)
see separate sheet

VIII. Certain observations on the international application

The following observations on the clarity of the claims, description, and drawings or on the question whether the
claims are fully supported by the description, are made:
see separate sheet

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Claims : 

1. A method of attaching a fragment of a first nucleic
acid molecule to a second nucleic acid molecule, wherein
said method comprises at least the steps:

1) cleaving said first nucleic acid molecule with a
nuclease which has a cleavage site separate from its
recognition site to create at least one fragment of said
first nucleic acid molecule having a single stranded
nucleotide region (SS1a) at at least one terminus of
said fragment,

2) if necessary generating a single stranded
nucleotide region (SS2) at at least one terminus of said
second nucleic acid molecule,

3) binding to at least one single stranded region of
step 1) (SS1a) an adapter molecule comprising at one
terminus a single stranded region (SSA1) complementary
to the single stranded region of said first nucleic acid
molecule fragment (SS1a) and additionally comprising at
the other terminus a further single stranded region
(SSA2) complementary to the single stranded region (SS2)
at one terminus of said second nucleic acid molecule,

4) ligating said adapter to said first nucleic acid
fragment,

5) binding said adapter to said second nucleic acid
molecule, and

6) ligating said adapter to said second nucleic acid
molecule.
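
Steps 1) and 3) can be illustrated with a small sequence-level sketch. It is not taken from the application: the enzyme parameters are those of FokI (recognition site GGATG, cuts 9 and 13 nucleotides downstream), chosen here only as a familiar example of a nuclease that cleaves outside its recognition site, and the function names and the example sequence are hypothetical. Only a site on the top strand is handled.

```python
def revcomp(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    comp = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(comp[base] for base in reversed(seq))


def cleave_outside_site(top, site="GGATG", top_offset=9, bottom_offset=13):
    """Cut a duplex, given as its top strand (5'->3'), at the first site found.

    The cut on the top strand lies top_offset bases after the recognition
    site and the cut on the bottom strand lies bottom_offset bases after it,
    so the bases between the two cuts are left single stranded.
    Returns (left_top, right_top, overhang).
    """
    start = top.find(site)
    if start < 0:
        raise ValueError("recognition site not found")
    end = start + len(site)
    cut_top = end + top_offset
    cut_bottom = end + bottom_offset
    if cut_bottom > len(top):
        raise ValueError("cleavage site falls outside the molecule")
    overhang = top[cut_top:cut_bottom]  # single stranded region (SS1a)
    return top[:cut_top], top[cut_top:], overhang


# Hypothetical first nucleic acid molecule.
first = "GGATG" + "ACGTACGTA" + "TTGC" + "AAAA"
left, right, ss1a = cleave_outside_site(first)
ssa1 = revcomp(ss1a)  # adapter region complementary to SS1a
print(ss1a, ssa1)
```

Because the overhang sequence depends on the bases next to the recognition site rather than on the site itself, each fragment can carry a different single stranded region, which is what lets an adapter with a matching SSA1 region select it.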



2. A method as claimed in claim 1 wherein said first
nucleic acid molecule fragment has a single stranded
nucleotide region at either terminus (SS1a and SS1b),
each of which is bound by an adapter, which may be the
same or different, and the first of said adapters is
bound to said second nucleic acid molecule and the
second of said adapters binds either to said second
nucleic acid molecule or to a third nucleic acid
molecule.

3. A method as claimed in claim 2, wherein said
adapters bind to the termini of said second nucleic acid
molecule, thereby forming a circular nucleic acid
molecule.

4. A method as claimed in any one of claims 1 to 3,
wherein said second nucleic acid molecule is a vector or
a fragment thereof and single stranded regions are
produced in step 2) by cleavage of said vector with a
nuclease.

5. A method as claimed in any one of claims 1 to 4,
wherein said adapter molecule additionally comprises one
or more nuclease recognition and cleavage sites.

6. A method as claimed in any one of claims 1 to 5,
wherein said nuclease is a restriction enzyme from the
class of IP or IIS enzymes.

7. A method as claimed in any one of claims 1 to 6,
wherein two or more fragments of the first nucleic acid
molecule are attached to different second and optionally
third nucleic acid molecules, or different termini
thereof.

8. A method as claimed in any one of claims 4 to 7, 
wherein one or more fragments of said first nucleic acid 
molecule are attached via adapters to single stranded 
regions in said second nucleic acid molecule resulting 
from different cleavage events. 

9. A method as claimed in claim 7 or 8, wherein one or 
more fragments of said first nucleic acid molecule are 
attached via adapters to single stranded regions in two 
or more second nucleic acid molecules.

10. A method as claimed in any one of claims 1 to 9,
wherein 2 or more first nucleic acid molecules are 
cleaved and bound to one or more second nucleic acid 
molecules by adapter molecules simultaneously in the 
same reaction. 

11. A method as claimed in any one of claims 1 to 10, 
wherein all the steps are conducted together. 

12. A nucleic acid molecule produced according to a
method as defined in any one of claims 1 to 11.

13. A cloning or expression vector containing the
nucleic acid molecule as defined in claim 12.

14. A eukaryotic or prokaryotic cell or transgenic
organism containing a vector as defined in claim 13.

15. A kit for attaching a first nucleic acid molecule
fragment to a second nucleic acid molecule or a fragment 
thereof according to the method defined in any one of 
claims 1 to 11 comprising at least (i) one or more 
adapters as described in any one of claims 1 to 9, (ii) 
the second nucleic acid molecule and (iii) a nuclease 
which cleaves outside its recognition site, wherein the 
terminus of one of said adapters has a single stranded 
region complementary to a single stranded region 
generated on said second nucleic acid molecule after 
cleavage with said nuclease. 

16. A method of synthesizing a double stranded nucleic
acid molecule comprising at least the steps of:

1) generating n double stranded nucleic acid
fragments, wherein at least n-2 fragments have single
stranded regions at both termini and 2 fragments have
single stranded regions at at least one terminus,
wherein (n-1) single stranded regions are complementary
to (n-1) other single stranded regions, thereby
producing (n-1) complementary pairs,

2) contacting said n double stranded nucleic acid
fragments, simultaneously or consecutively, to effect
binding of said complementary pairs of single stranded
regions, and

3) optionally ligating said complementary pairs
simultaneously or consecutively to produce a nucleic
acid molecule consisting of n fragments.
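
The pairing logic of this claim (n fragments joined through n-1 complementary pairs of single stranded regions) can be sketched as below. This is an illustrative model only: the `Fragment` class, the `assemble` function and the example sequences are hypothetical, each single stranded region is represented simply by its sequence read 5'->3', and two regions are treated as a pair when one is the reverse complement of the other.

```python
from dataclasses import dataclass
from typing import Optional

_COMPLEMENT = str.maketrans("ATGC", "TACG")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


@dataclass(frozen=True)
class Fragment:
    name: str
    left: Optional[str]   # single stranded region at the left terminus, if any
    right: Optional[str]  # single stranded region at the right terminus, if any


def assemble(fragments):
    """Return fragment names in chain order, joining complementary pairs."""
    by_left = {f.left: f for f in fragments if f.left is not None}
    starts = [f for f in fragments if f.left is None]
    if len(starts) != 1:
        raise ValueError("expected exactly one fragment without a left region")
    chain = [starts[0]]
    while chain[-1].right is not None:
        partner = by_left.get(revcomp(chain[-1].right))
        if partner is None or partner in chain:
            raise ValueError(f"no unused partner for fragment {chain[-1].name}")
        chain.append(partner)
    if len(chain) != len(fragments):
        raise ValueError("not all fragments were joined")
    return [f.name for f in chain]


# n = 3: one fragment with regions at both termini, two with a single region,
# giving n - 1 = 2 complementary pairs.
fragments = [
    Fragment("C", left="GGTA", right=None),
    Fragment("A", left=None, right="AACG"),
    Fragment("B", left="CGTT", right="TACC"),
]
print(assemble(fragments))
```

Because each region has exactly one complementary partner, the order of the chain is fixed by the sequences themselves, regardless of the order in which the fragments are brought into contact.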

17. A method as claimed in claim 16 wherein said
fragments are each between 8 and 25 bases in length.

18. A method as claimed in claim 16 or 17 wherein n is
at least 10.

19. A method as claimed in any one of claims 16 to 18
wherein said fragment comprises a region representing a
unit of information corresponding to one or more code
elements.

20. A method as claimed in claim 19 wherein said code
is alphanumeric.

21. A method as claimed in claim 20 wherein said code
is binary.

22. A method as claimed in any one of claims 19 to 21
wherein each of said one or more code elements has the
formula

(X)a,
wherein

X is a nucleotide A, T, G, C or a derivative
thereof which allows complementary binding and may be
the same or different at each position, and

a is an integer from 4 to 10,
wherein (X)a is different for each one or more code
elements.

23. A method as claimed in claim 22, wherein said code
is binary and the code elements "1" and "0" have the
formulae:

"0" = (X)a and "1" = (Y)b,
wherein

(X)a and (Y)b are not identical,

X and Y are each a nucleotide A, T, G, C or a
derivative thereof which allows complementary binding
and may be the same or different at each position, and

a and b are integers from 4 to 10.

24. A method as claimed in claim 23 wherein in the
formulae (X)a and (Y)b, X and Y are the same at each
position.
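
A binary code of the kind defined in claims 23 and 24 can be illustrated with a short sketch. The particular elements chosen here ("0" as four A residues, "1" as six C residues) are assumptions made for the example and satisfy the stated limits of 4 to 10 for a and b; the sketch also ignores any linker or single stranded regions that a real fragment would carry between code elements.

```python
from itertools import groupby

# Hypothetical code elements: "0" = (X)a with X = A, a = 4,
# and "1" = (Y)b with Y = C, b = 6.
ELEMENTS = {"0": "A" * 4, "1": "C" * 6}


def encode(bits):
    """Translate a string of 0/1 characters into a nucleotide sequence."""
    for bit in bits:
        if bit not in ELEMENTS:
            raise ValueError(f"not a binary digit: {bit!r}")
    return "".join(ELEMENTS[bit] for bit in bits)


def decode(seq):
    """Recover the bit string from a sequence built by encode()."""
    by_base = {elem[0]: (bit, len(elem)) for bit, elem in ELEMENTS.items()}
    bits = []
    for base, run in groupby(seq):
        length = sum(1 for _ in run)
        if base not in by_base:
            raise ValueError(f"unexpected base: {base!r}")
        bit, size = by_base[base]
        if length % size:
            raise ValueError(f"run of {length} {base} is not a whole element")
        bits.append(bit * (length // size))
    return "".join(bits)


message = "0110"
sequence = encode(message)
print(sequence, decode(sequence))
```

Using the same nucleotide at every position of an element, as in claim 24, makes each element a homopolymer run, so a run can be read back by dividing its length by the element length.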

25. A method of synthesizing a double stranded nucleic
acid molecule comprising at least the steps of: 

1) generating fragment chains according to the method 
defined in any one of claims 16 to 24; 

2) optionally generating single stranded regions at 
the end of said fragment chains, wherein said single 
stranded regions are complementary to other single 
stranded regions on said fragment chains thus forming 
complementary pairs of single stranded regions; 

3) contacting said fragment chains with one another, 
simultaneously or consecutively, to effect binding of 
said complementary pairs of single stranded regions.

26. A nucleic acid molecule produced according to a 
method as defined in any one of claims 16 to 25, or a 
single stranded nucleic acid molecule thereof. 

27. A method of identifying the code elements contained
in a nucleic acid molecule prepared according to a
method as defined in any one of claims 16 to 25, wherein
a probe, carrying a signalling means, specific to one or
more code elements, is bound to said nucleic acid
molecule and a signal generated by said signalling means
is detected, whereby said one or more code elements may
be identified.



28. A library of fragments as defined in any one of
claims 16 to 27, comprising (n)m fragments, wherein n is
as defined in any one of claims 16 to 27 and corresponds
to the length of chain that said library may produce,
and m is an integer corresponding to the number of
possible code elements or combinations thereof, such
that fragments corresponding to all possible code
elements for each position in the final chain are
provided.
FOR FURTHER ACTION    See paragraphs 1 and 4 below


International application No. 

PCT/GB 00/02512 


International filing date 
(day/month/year) 27/06/2000 


Applicant 

COMPLETE GENOMICS AS 



1. [X] The applicant is hereby notified that the International Search Report has been established and is transmitted herewith.
Filing of amendments and statement under Article 19: 

The applicant is entitled, if he so wishes, to amend the claims of the International Application (see Rule 46): 

When? The time limit for filing such amendments is normally 2 months from the date of transmittal of the 
International Search Report; however, for more details, see the notes on the accompanying sheet. 



Where? Directly to the 



International Bureau of WIPO 
34, chemin des Colombettes 
1211 Geneva 20, Switzerland 
Facsimile No.: (41-22) 740.14.35



For more detailed instructions, see the notes on the accompanying sheet. 

2. [ ] The applicant is hereby notified that no International Search Report will be established and that the declaration under
Article 17(2)(a) to that effect is transmitted herewith.

3. With regard to the protest against payment of (an) additional fee(s) under Rule 40.2, the applicant is notified that: 

[ ] the protest together with the decision thereon has been transmitted to the International Bureau together with the
applicant's request to forward the texts of both the protest and the decision thereon to the designated Offices.

[ ] no decision has been made yet on the protest; the applicant will be notified as soon as a decision is made.

4. Further action(s): The applicant is reminded of the following:

Shortly after 18 months from the priority date, the international application will be published by the International Bureau. 
If the applicant wishes to avoid or postpone publication, a notice of withdrawal of the international application, or of the 
priority claim, must reach the International Bureau as provided in Rules 90bis.1 and 90bis.3, respectively, before the
completion of the technical preparations for international publication. 

Within 19 months from the priority date, a demand for international preliminary examination must be filed if the applicant 
wishes to postpone the entry into the national phase until 30 months from the priority date (in some Offices even later). 

Within 20 months from the priority date, the applicant must perform the prescribed acts for entry into the national phase 
before all designated Offices which have not been elected in the demand or in a later election within 19 months from the 
priority date or could not be elected because they are not bound by Chapter II. 
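
The reminders above are all counted in months from the priority date. As a rough illustration, the corresponding calendar dates can be worked out as below, using the priority date of 28 June 1999 stated elsewhere in this file. The helper `add_months` and the milestone labels are assumptions made for the example, and the sketch does not apply the rules on periods expiring on non-working days.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Return the date `months` months after d, clamped to the month's end."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


PRIORITY_DATE = date(1999, 6, 28)

# Months from the priority date, as listed in the reminders above.
MILESTONES = {
    "publication": 18,
    "demand": 19,
    "national phase (not elected)": 20,
    "national phase (elected)": 30,
}

deadlines = {label: add_months(PRIORITY_DATE, m) for label, m in MILESTONES.items()}
for label, when in deadlines.items():
    print(f"{label}: {when.isoformat()}")
```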



Name and mailing address of the International Searching Authority 

European Patent Office, P.B. 5818 Patentlaan 2
NL-2280 HV Rijswijk

Tel. (+31-70) 340-2040, Tx. 31 651 epo nl,
Fax: (+31-70) 340-3016




Authorized officer 

Mireille Claudepierre 
NOTES TO FORM PCT/ISA/220



These Notes are intended to give the basic instructions concerning the filing of amendments under Article 19. The
Notes are based on the requirements of the Patent Cooperation Treaty, the Regulations and the Administrative Instructions
under that Treaty. In case of discrepancy between these Notes and those requirements, the latter are applicable. For more
detailed information, see also the PCT Applicant's Guide, a publication of WIPO.

In these Notes, "Article", "Rule", and "Section" refer to the provisions of the PCT, the PCT Regulations and the PCT
Administrative Instructions respectively.



INSTRUCTIONS CONCERNING AMENDMENTS UNDER ARTICLE 19 



The applicant has, after having received the international search report, one opportunity to amend the claims of the 
international application. It should however be emphasized that, since all parts of the international application (claims,
description and drawings) may be amended during the international preliminary examination procedure, there is usually
no need to file amendments of the claims under Article 19 except where, e.g. the applicant wants the latter to be published
for the purposes of provisional protection or has another reason for amending the claims before international publication.
Furthermore, it should be emphasized that provisional protection is available in some States only. 



What parts of the international application may be amended?

Under Article 19, only the claims may be amended.

During the international phase, the claims may also be amended (or further amended) under Article 34 before
the International Preliminary Examining Authority. The description and drawings may only be amended under
Article 34 before the International Preliminary Examining Authority.

Upon entry into the national phase, all parts of the international application may be amended under Article 28
or, where applicable, Article 41.



When? Within 2 months from the date of transmittal of the international search report or 16 months from the priority
date, whichever time limit expires later. It should be noted, however, that the amendments will be considered
as having been received on time if they are received by the International Bureau after the expiration of the
applicable time limit but before the completion of the technical preparations for international publication
(Rule 46.1).
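
The "whichever expires later" rule can be illustrated with a short calculation. The function names are assumptions made for this sketch, and the transmittal date used in the example is hypothetical, since the mailing date of the search report is not shown here; only the priority date of 28 June 1999 comes from this file. Rules on non-working days and the grace period before the technical preparations for publication are not modelled.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Return the date `months` months after d, clamped to the month's end."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def article19_deadline(transmittal, priority):
    """Later of: 2 months from transmittal, 16 months from the priority date."""
    return max(add_months(transmittal, 2), add_months(priority, 16))


priority = date(1999, 6, 28)
transmittal = date(2000, 11, 20)  # hypothetical date of transmittal
print(article19_deadline(transmittal, priority))
```

With these inputs the 16-month period ends on 28 October 2000, so the 2-month period from transmittal is the one that governs.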



Where not to file the amendments? 

The amendments may only be filed with the International Bureau and not with the receiving Office or the 
International Searching Authority (Rule 46.2). 

Where a demand for international preliminary examination has been /is filed, see below. 



How? Either by cancelling one or more entire claims, by adding one or more new claims or by amending the text of
one or more of the claims as filed.

A replacement sheet must be submitted for each sheet of the claims which, on account of an amendment or
amendments, differs from the sheet originally filed.

All the claims appearing on a replacement sheet must be numbered in Arabic numerals. Where a claim is
cancelled, no renumbering of the other claims is required. In all cases where claims are renumbered, they must
be renumbered consecutively (Administrative Instructions, Section 205(b)).

The amendments must be made in the language in which the international application is to be published.



What documents must/may accompany the amendments? 
Letter (Section 205(b)): 

The amendments must be submitted with a letter. 

The letter will not be published with the international application and the amended claims. It should not be 
confused with the "Statement under Article 19(1)" (see below, under "Statement under Article 19(1)"). 

The letter must be in English or French, at the choice of the applicant. However, if the language of the
international application is English, the letter must be in English; if the language of the international application
is French, the letter must be in French.
NOTES TO FORM PCT/ISA/220 (continued)



The letter must indicate the differences between the claims as filed and the claims as amended. It must, in 
particular, indicate, in connection with each claim appearing in the international application (it being understood 
that identical indications concerning several claims may be grouped), whether 

(i) the claim is unchanged; 

(ii) the claim is cancelled; 

(iii) the claim is new; 

(iv) the claim replaces one or more claims as filed; 

(v) the claim is the result of the division of a claim as filed. 



The following examples illustrate the manner in which amendments must be explained in the
accompanying letter:

1. [Where originally there were 48 claims and after amendment of some claims there are 51]:

"Claims 1 to 29, 31, 32, 34, 35, 37 to 48 replaced by amended claims bearing the same numbers;
claims 30, 33 and 36 unchanged; new claims 49 to 51 added."

2. [Where originally there were 15 claims and after amendment of all claims there are 11]:
"Claims 1 to 15 replaced by amended claims 1 to 11."

3. [Where originally there were 14 claims and the amendments consist in cancelling some claims and in adding
new claims]:

"Claims 1 to 6 and 14 unchanged; claims 7 to 13 cancelled; new claims 15, 16 and 17 added." or
"Claims 7 to 13 cancelled; new claims 15, 16 and 17 added; all other claims unchanged."

4. [Where various kinds of amendments are made]:

"Claims 1-10 unchanged; claims 11 to 13, 18 and 19 cancelled; claims 14, 15 and 16 replaced by amended
claim 14; claim 17 subdivided into amended claims 15, 16 and 17; new claims 20 and 21 added."



"Statement under Article 19(1)" (Rule 46.4)

The amendments may be accompanied by a statement explaining the amendments and indicating any impact
that such amendments might have on the description and the drawings (which cannot be amended under
Article 19(1)).

The statement will be published with the international application and the amended claims.
It must be in the language in which the international application is to be published.
It must be brief, not exceeding 500 words if in English or if translated into English.

It should not be confused with and does not replace the letter indicating the differences between the claims
as filed and as amended. It must be filed on a separate sheet and must be identified as such by a heading,
preferably by using the words "Statement under Article 19(1)."

It may not contain any disparaging comments on the international search report or the relevance of citations 
contained in that report. Reference to citations, relevant to a given claim, contained in the international search 
report may be made only in connection with an amendment of that claim. 



Consequence if a demand for international preliminary examination has already been filed

If, at the time of filing any amendments under Article 19, a demand for international preliminary examination
has already been submitted, the applicant must preferably, at the same time of filing the amendments with the 
International Bureau, also file a copy of such amendments with the International Preliminary Examining 
Authority (see Rule 62.2(a), first sentence). 



Consequence with regard to translation of the international application for entry into the national phase

The applicant's attention is drawn to the fact that, upon entry into the national phase, a translation of the
claims as amended under Article 19 may have to be furnished to the designated/elected Offices, instead of, or
in addition to, the translation of the claims as filed.

For further details on the requirements of each designated/elected Office, see Volume II of the PCT Applicant's

INTERNATIONAL SEARCH REPORT

(PCT Article 18 and Rules 43 and 44) 



Applicant's or agent's file reference 

42.73369 


FOR FURTHER ACTION    see Notification of Transmittal of International Search Report
(Form PCT/ISA/220) as well as, where applicable, item 5 below.




International application No. 

PCT/GB 00/02512 


International filing date (day/month/year) 

27/06/2000 


(Earliest) Priority Date (day/month/year) 

27/06/1999 




COMPLETE GENOMICS AS 




This International Search Report has been prepared by this International Searching Authority and is transmitted to the applicant
according to Article 18. A copy is being transmitted to the International Bureau.

This International Search Report consists of a total of 5 sheets.

[X] It is also accompanied by a copy of each prior art document cited in this report.




1. Basis of the report 

a. With regard to the language, the international search was carried out on the basis of the international application in the 
language in which it was filed, unless otherwise indicated under this item. 

[ ] the international search was carried out on the basis of a translation of the international application furnished to this
Authority (Rule 23.1(b)).

b. With regard to any nucleotide and/or amino acid sequence disclosed in the international application, the international search
was carried out on the basis of the sequence listing:

[ ] contained in the international application in written form.

[X] filed together with the international application in computer readable form.

[X] furnished subsequently to this Authority in written form.

[X] furnished subsequently to this Authority in computer readable form.

[X] the statement that the subsequently furnished written sequence listing does not go beyond the disclosure in the
international application as filed has been furnished.

[X] the statement that the information recorded in computer readable form is identical to the written sequence listing has been
furnished.

2. [ ] Certain claims were found unsearchable (see Box I).

3. [ ] Unity of invention is lacking (see Box II).

4. With regard to the title,

[ ] the text is approved as submitted by the applicant.

[X] the text has been established by this Authority to read as follows:

METHODS OF CLONING AND PRODUCING FRAGMENT CHAINS WITH READABLE INFORMATION
CONTENT

5. With regard to the abstract,

[X] the text is approved as submitted by the applicant.

[ ] the text has been established, according to Rule 38.2(b), by this Authority as it appears in Box III. The applicant may,
within one month from the date of mailing of this international search report, submit comments to this Authority.

6. The figure of the drawings to be published with the abstract is Figure No. ]

[ ] as suggested by the applicant.    [ ] None of the figures.

[X] because the applicant failed to suggest a figure.

[ ] because this figure better characterizes the invention.





Form PCT/ISA/210 (first sheet) (July 1998) 



INTERNATIONAL SEARCH REPORT 






International Application No 

PCT/GB 00/02512 



A. CLASSIFICATION OF SUBJECT MATTER

IPC 7 C12N15/10 C12N15/66 C12Q1/68 